accurately modelled with two mechanisms: bugs ("competence" phenomena
reflecting mistaken beliefs about subtraction which are stable over
time) and slips ("performance" phenomena loosely related to
subtraction which are unstable over time). A computational
descriptive framework, the "Buggy" model, and a diagnostic program
(DEBUGGY) were developed wherein bugs and slips modelled performance,
describing the content of wrong answers and the steps taken in
producing them. Students (N=925) studying subtraction were tested
using the diagnostic tests developed by DEBUGGY. Some students were
retested two days later to measure short-term stability of bugs,
while others were retested several months later to study long-term
stability. All tests were analyzed by DEBUGGY and by expert
diagnosticians to assess DEBUGGY's diagnostic abilities. Findings
indicate DEBUGGY was as good as or better than human diagnosticians
at discovering bugs that explain students' errors. Other findings
challenge the belief that bugs and slips alone account for procedural
error data. In addition, predictions of Repair Theory (a theory that
predicted which bugs would exist for a given procedural skill) were
verified. (Author/JN)
Abstract

Prior to the empirical studies reported here, it was felt that errors in procedural skills such as
multi-digit subtraction could be accurately modeled with two mechanisms: bugs and slips. Slips
are taken as "performance" phenomena that were expected to be highly unstable over time and
only loosely related to the subtraction problems they occur in. Bugs are "competence" phenomena
reflecting mistaken beliefs about the skill and, as such, are expected to be consistent across a whole
test and stable across tests given some days and even months apart. A computational descriptive
framework, the "Buggy" paradigm, and a sophisticated diagnostic program, DEBUGGY, were
developed wherein bugs and slips modeled performance at a very fine level of detail, describing
the content of the wrong answers as well as the steps taken in producing them.

This report presents the results of several extensive empirical studies. 925 students who were in
the process of learning subtraction were tested using highly diagnostic tests developed by
DEBUGGY. Some students were retested two days later to measure the short-term stability of bugs,
and others were retested several months later to study long-term stability. All tests were analyzed
by DEBUGGY and by several expert diagnosticians in order to assess DEBUGGY's diagnostic abilities.

It was found that DEBUGGY was as good as or better than human diagnosticians at discovering
bugs that explain a student's errors. However, a third of the students who committed errors could
not be modelled with bugs and slips. Moreover, bugs were found in general to be unstable rather
than stable, both in the short term and the long term. These findings challenged the belief that
bugs and slips alone could account for procedural error data.

Repair Theory was originally developed as a generative theory of bugs, one that predicted which
bugs would exist for a given procedural skill. However, it also predicted that certain kinds of
non-bug, non-slip performance would exist, both in the static analyses and the stability data. These
predictions were verified. It now appears that all but a small (but still significant) fraction of the
phenomena can be precisely modelled using bugs, slips and the mechanisms of Repair Theory.
It has long been known that many of the errors that students make in exercising a procedural skill
such as ordinary place-value subtraction are systematic in that the errors appear to stem from
consistent application of a faulty method, algorithm or rule. These errors occur along with the
familiar unsystematic or "careless" errors which occur in expert performance as well as
the learner's behavior. The common opinion is that careless errors, or "slips" as we prefer to call
them (c.f. Norman, 1981), are performance phenomena, an inherent part of the "noise" of the
human information processor. Systematic errors, on the other hand, are taken as stemming from
mistaken or missing knowledge about the skill, the product of incomplete or misguided learning.
By studying them, insight can be gained into the mysterious processes of learning and memory. It
is also commonly believed that there are a relatively small number of systematic errors for any
given skill, perhaps a dozen or a hundred, and that once a student has acquired one of these
unfortunate habits, it will be held until it is remediated. The data reported here challenge some of
these basic beliefs, while supporting others.

In the last several years, a large-scale investigation of these errors has been
conducted, with a special emphasis on systematic errors. It began with the observation that systematic
errors could be formally represented and precisely described as "bugs" in a correct procedure for
the skill. In brief, a bug is a slight modification or perturbation of a correct procedure. The bug-based
notation is complete in the sense that it not only describes which problems a student gets
wrong, but what each wrong answer is and the steps followed by the student in producing it. This
fine-grained description raises a number of questions, such as: How many different bugs are
there? How are these bugs acquired by students? How long are they held? What makes them go
away? A question of fundamental importance is whether there are any students whose errors can
be described neither as bugs nor as slips. This question challenges the foundational belief that
errors are either systematic (i.e. deterministic, procedural) or unsystematic (i.e. careless,
unintended).

To answer such questions, a large number of student errors have been collected and analyzed in
terms of bugs and slips. The data were analyzed with the aid of DEBUGGY (Burton, 1981), a
computer program that can determine which bugs, if any, underlie a student's errors. Thus, in
addition to reporting data that bear on potential psychological and pedagogical theories, this paper
provides an assessment of a particular approach to computerized student diagnosis.
This report can be read at three levels of detail. The first level of detail weaves a summary of the
results in with an introduction to the bug formalism for describing errors and a description of a
recent theory, Repair Theory (Brown & VanLehn, 1980), that aims to explain the acquisition of
bugs. It is meant to be a glossary of the concepts used to analyze the data as well as a quick
synopsis of the findings. The first level of detail is section 1. The second level of detail is
contained in the following sections. The
methods for gathering and analyzing the data are discussed, the empirical predictions of Repair
Theory are checked, and an extensive discussion of the adequacy of DEBUGGY-based diagnosis is
presented. The third level of detail is contained in a set of appendices, which present the data in
tabular form and discuss some relatively minor points concerning its aggregation and tabulation.

1. An introduction to the concepts and the findings

Slips in arithmetic hardly need an introduction since as adults all our arithmetic errors are slips.
We have all experienced the forgotten carry, the unnecessary borrow, and of course the ever
present "facts errors" wherein one mis-remembers an elementary number combination. These
apparently careless, unintentional errors also occur in the work of students learning arithmetic. Of
895 third, fourth and fifth grade students, 182 (20%) were found by DEBUGGY to know the
correct algorithm for subtraction but to make one or more slips during testing.




When students were tested twice a few days apart, half the students who answered all problems
correctly one day made slips the other day. This finding confirms the impression that slips are
unintentional, careless mistakes in that a little extra care (or something!) apparently makes them
disappear. So, slips explain a large number of students' errors, and their unstable existence
conforms with the intuitive expectation that they are due to "noise" in the processor rather than
conceptual defects.

What's a bug?



The following problems display a systematic error:

      306     80    183    602    3005    7002     34    251
    - 138   -  4   - 95   - 11   -  28   - 239   - 14   - 47
       78     76     88    591    1087    4873     24    244



One could vaguely describe these problems as coming from a student having trouble with
borrowing, especially in the presence of zeros. More precisely, the student misses all the problems
that require borrowing from zero. One could say that he has not mastered the subskill of
borrowing across zero. This description of the systematic error is fine at one level: it is a testable
prediction about what new problems the student will get wrong. It predicts for example that the
student will miss 305-117 and will get 315-117 correct. Systematic errors described at this level are
the data upon which several psychological and pedagogical theories have been built (e.g. Durnin &
Scandura, 1977). It has become common to use testing programs based on this notion for
placement, advancement and remediation in structured curricula, such as mathematics. Such
testing programs are often labelled "domain referenced" or "criterion referenced."

Once we look beyond what kinds of exercises the student misses and look at the actual answers
given, we find in many cases that these answers can be precisely predicted by computing the
answers to the given problems using a procedure which is a small perturbation in the fine structure
of the correct procedure. Such perturbations serve as a precise description of the errors. We call
them "bugs."

The student whose work appears above has a bug called Borrow-Across-Zero. This bug modifies
the correct subtraction procedure by deleting the step wherein the zero is changed to a nine during
borrowing across zero (this bug and others like it are described more thoroughly in appendix 1).
This modification creates a procedure for answering subtraction problems. As a hypothesis, it
predicts not only which new problems the student will miss, but also what his answers will be.
For example, it predicts that the student above would answer 305-117=98 and 315-117=198.
Since the bug-based descriptions of systematic errors predict behavior at a finer level of detail than
missing-subskill/domain referenced testing, they have the potential to form a better basis for
cognitive theories of learning and errors, and perhaps a better basis for remediation or placement
as well.
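
The two predictions can be reproduced mechanically. The following Python sketch is an illustration
only and is not DEBUGGY's representation of the bug: the function name `subtract` and the flag
`borrow_across_zero` are assumptions introduced here. It subtracts column by column and, when the
flag is set, omits the step that changes a zero to a nine, so the decrement lands on the nearest
non-zero digit to the left.

```python
def subtract(top, bottom, borrow_across_zero=False):
    """Column-by-column subtraction, right to left (assumes top >= bottom >= 0).

    With borrow_across_zero=True, a borrow that meets a zero leaves the zero
    unchanged and decrements the nearest non-zero digit to its left.
    """
    t = [int(d) for d in str(top)][::-1]        # units digit first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))                # blanks are treated as zeros
    answer = []
    for i in range(len(t)):
        if t[i] < b[i]:                         # this column needs a borrow
            j = i + 1
            while j < len(t) and t[j] == 0:     # walking left across zeros
                if not borrow_across_zero:
                    t[j] = 9                    # correct step: the zero becomes a nine
                j += 1
            if j == len(t):
                # A case this sketch does not model (nothing left to decrement).
                raise ValueError("no digit left to decrement")
            t[j] -= 1
            t[i] += 10
        answer.append(t[i] - b[i])
    return int("".join(str(d) for d in reversed(answer)))


if __name__ == "__main__":
    for top, bottom in [(305, 117), (315, 117), (306, 138)]:
        print(top, bottom, subtract(top, bottom), subtract(top, bottom, True))
```

Under these assumptions the buggy procedure answers 305-117 with 98 and 315-117 with 198, as the
text predicts, while the unmodified procedure gives 188 and 198.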

It is often the case that a student has more than one bug at a time. Indeed, the example
given above illustrates co-occurrence of bugs. The last two problems are answered incorrectly but
the bug Borrow-Across-Zero does not predict their answers (it predicts the two problems would be
answered correctly). A second bug, called Diff-N-N=N, is present. When the student comes to
subtract a column where the top and bottom digits are equal, instead of writing zero in the answer,
he writes the digit that appears in the column.
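
As a rough illustration of this second bug, the sketch below applies the rule "N - N = N" inside an
otherwise correct procedure. The function name `subtract_diff_n_n` is hypothetical, and two details
are assumptions rather than facts from the report: the comparison is made on the top digit as it
stands after any borrowing, and a blank in the bottom row does not count as a digit.

```python
def subtract_diff_n_n(top, bottom):
    """Correct borrowing, but a column with equal top and bottom digits yields that digit.

    Assumes top >= bottom >= 0.
    """
    t = [int(d) for d in str(top)][::-1]            # units digit first
    b = [int(d) for d in str(bottom)][::-1]
    digits = []
    for i in range(len(t)):
        below = b[i] if i < len(b) else None        # None marks a blank
        if below is not None and t[i] < below:      # ordinary, correct borrow
            j = i + 1
            while t[j] == 0:
                t[j] = 9
                j += 1
            t[j] -= 1
            t[i] += 10
        if below is not None and t[i] == below:
            digits.append(t[i])                     # the bug: N - N = N
        else:
            digits.append(t[i] - (below or 0))
    return int("".join(str(d) for d in reversed(digits)))


if __name__ == "__main__":
    print(subtract_diff_n_n(34, 14), subtract_diff_n_n(251, 47))
```

With these assumptions the sketch answers 34-14 with 24 and 251-47 with 244, and it still answers
problems without an equal-digit column, such as 80-4 and 183-95, correctly.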

It often takes a set of bugs to form an accurate description of a student's errors. Of the 417
students that DEBUGGY analyzed as having bugs, 150 (36%) had a multi-bug diagnosis. Most of
these diagnoses consisted of two or three bugs, but there were several cases of four bugs
co-occurring. So, DEBUGGY's ability to combine bugs to form an accurate diagnosis turned out to be
very important.

A brief look at the bug data

Overall, 77 distinct bugs occurred (by "occurred," we mean that a student had the bug as his
diagnosis, or if he was diagnosed as having a set of bugs, as part of his diagnosis). A few bugs
occurred quite often. The most common bug by far was Smaller-From-Larger (this bug never
borrows but instead simply takes the absolute difference in each column). It occurred 106 times
alone, and 18 times as part of multi-bug diagnoses. From there the frequency fell off rapidly, with
the next five most common bugs coming in at 67, 51, [illegible], 22 and 19 occurrences. About half the
bugs (32) were quite rare, occurring only once or twice. This marked skew in the frequencies of
occurrence explains the impression left by informal studies that there are only a dozen or so
systematic errors. In fact, there are many more, but it took precision analysis of thousands of
students to find them. This raises the question, how could so many bugs come to exist?
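
The parenthetical definition of Smaller-From-Larger is simple enough to state as code. The sketch
below is illustrative only: the function name is hypothetical, and treating a blank in the bottom
row as zero is an assumption made here.

```python
def smaller_from_larger(top, bottom):
    """Never borrow: write the absolute difference of the digits in each column."""
    t = str(top)
    b = str(bottom).rjust(len(t), "0")      # assumption: a blank counts as zero
    return int("".join(str(abs(int(x) - int(y))) for x, y in zip(t, b)))


if __name__ == "__main__":
    print(smaller_from_larger(306, 138))    # columns: |3-1|, |0-3|, |6-8|
```

For 306-138 the columns give 2, 3 and 2, so the sketch answers 232 where the correct answer is 168.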

Repair Theory: repairs to impasses

Repair Theory is a generative theory in that it attempts to explain why we found the bugs that we
did and not other ones, to explain how bugs are caused, and most importantly, to predict what
bugs will exist for procedural skills we have not yet analyzed. There are several benefits of a
generative theory. We could automatically generate a list of bugs for a new skill and add these
bugs to DEBUGGY, creating a diagnostic system tailored to the new skill. We could attack the issue
of remediation of bugs with more than just a knowledge of what bugs a student has, since such a
theory would provide a plausible basis for understanding why the student has those bugs. Such
an understanding could also help us design learning environments that might inhibit formation of
those bugs in the first place. Finally, in terms of cognitive research, such a theory would provide
insights into knowledge representations and cognitive mechanisms (e.g. skill acquisition) that elude
direct observation.

Repair Theory is based on the insight that when a student gets stuck while executing his possibly
incomplete subtraction procedure, he is unlikely to just quit as a computer does when it can't
execute the next step in a procedure. Instead, the student will do a small amount of problem
solving, just enough to get "unstuck" and complete the subtraction problem. These local problem
solving strategies are called "repairs" despite the fact that they rarely succeed in rectifying the
broken procedure. Repairs are quite simple tactics, such as skipping the operation that can't be
performed or backing up to the last branch point in the procedure and taking a different path.
They do not in general result in a correct solution to the subtraction problem but instead result in
a buggy solution. For example, suppose the student has never borrowed from zero. The first time
he is asked to solve a borrow-from-zero problem, such as the one shown in (a),



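[Figure: a borrow-from-zero problem shown (a) unsolved, (b) solved with the decrement skipped, and (c) solved with the decrement relocated to the hundreds digit]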





he begins processing the units column by attempting to borrow from the tens column, and
immediately reaches an impasse because zero cannot be decremented. He's stuck so he does a
repair. One repair is simply to skip the decrement operation. This leads ultimately to the solution
shown in (b). If he uses this repair to the borrow-from-zero impasse throughout a whole
subtraction test, he will be diagnosed as having the bug Stops-Borrow-At-Zero. Suppose he
chooses a different repair, namely to relocate the decrement operation and do it instead on a
nearby digit that is not zero, such as the nearest digit to the left in the top row, namely the three.
This repair results in the solution shown in (c); the three has been decremented twice, once for the
(repaired) borrow originating in the units column, and once for the borrow originating in the
(unchanged) tens column. If he always chooses this repair to the impasse, he will be diagnosed as
having the bug Borrow-Across-Zero.
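
The impasse/repair account can be sketched as a single incomplete procedure with a pluggable
repair. Everything named below is an assumption made for illustration: the function `solve`, the
repair labels "skip" and "relocate", and the sample problem 305-167, which is chosen here and is not
taken from the report's figure. The procedure knows ordinary borrowing but reaches an impasse
whenever the digit it must decrement is a zero.

```python
def solve(top, bottom, repair):
    """Subtract with a procedure that never learned to borrow from zero.

    repair: "skip" (drop the decrement) or "relocate" (decrement the nearest
    non-zero digit to the left), applied at every borrow-from-zero impasse.
    """
    if repair not in ("skip", "relocate"):
        raise ValueError("unknown repair: " + repr(repair))
    t = [int(d) for d in str(top)][::-1]        # units digit first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))
    out = []
    for i in range(len(t)):
        if t[i] < b[i]:                         # borrow needed
            j = i + 1
            if j == len(t):
                raise ValueError("impasse not covered by this sketch")
            if t[j] == 0:                       # impasse: zero cannot be decremented
                if repair == "skip":
                    j = None                    # repair 1: skip the decrement
                else:
                    while j < len(t) and t[j] == 0:
                        j += 1                  # repair 2: look further left
                    if j == len(t):
                        j = None                # nothing to decrement at all
            if j is not None:
                t[j] -= 1
            t[i] += 10
        out.append(t[i] - b[i])
    return int("".join(str(d) for d in reversed(out)))


if __name__ == "__main__":
    print(solve(305, 167, "skip"), solve(305, 167, "relocate"))
```

On the sample problem, consistently skipping the decrement yields 148 (the pattern the text calls
Stops-Borrow-At-Zero) and consistently relocating it yields 48 (Borrow-Across-Zero); the correct
answer is 138. One incomplete procedure thus gives rise to two different bugs, depending only on
the repair.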

Bug migration and tinkering

Many bugs can be generated by this impasse-repair process (the exact number depends on how the
sets of impasses and repairs are constrained; see section 6). However, we hardly expected to
actually see this process in operation during the execution of one of the tests we gave. Instead, we
expected that the impasse and repair had happened some time long ago and the resulting sequence
of steps had become habitual, that is, a consistent, stable bug had been formed. Nonetheless, two
predictions were advanced, namely that a few students would be found who would repair an
impasse several different ways on a test (a phenomenon we label "tinkering"), and secondly, that
some students would switch from one repair to another between tests, a phenomenon labelled "bug
migration" because it would show up as a consistent bug on the first test, and a consistent but
different bug on the second test. The following three pseudo-tests illustrate these phenomena:
Row (a) has been answered by a hypothetical student with the bug Stops-Borrow-At-Zero, row (b)
by a student with the bug Borrow-Across-Zero, and row (c) by a tinkerer. All three hypothetical
students cannot borrow from zero. They differ only in how they repair the resulting impasses.
The first repairs by skipping the decrement operation. The second repairs by decrementing the
nearest non-zero digit to the left of the zero. The tinkerer reacts by sometimes doing one of these
two repairs and sometimes the other. On problems 1, 2 and 6, the tinkerer skips the troublesome
decrements, producing the same answers as the student in row (a) who has the bug
Stops-Borrow-At-Zero. On problems 4, 5 and 7, the tinkerer refocuses leftward, leading to the same
answers as row (b). (Problem three does not involve borrowing from zero, so all three students
answer it correctly.) The two buggy students always repair their borrow-from-zero impasses
consistently, while the tinkerer switches back and forth between two different repairs. In general a
tinkerer can switch among several repairs.

Although DEBUGGY is not able to tell when a student is tinkering, intensive hand analysis of 120
students revealed that 14 (12%) of the students were tinkering. So this prediction of Repair
Theory was verified.

To observe bug migration, students are tested twice a short time apart. If a student answered as in
(a) on the first test and (b) on the second test, then we would have a case of bug migration. The
bug Stops-Borrow-At-Zero has migrated into the bug Borrow-Across-Zero. Only 67 students
were tested in this two-test condition, and of these only 12 were diagnosed as having bugs on both
tests. However, of these 12 students, two (17%) exhibited bug migration, verifying the predictions
of Repair Theory.

Repair Theory also predicts that students can tinker on one test and have a bug on the other.
That is, a student can answer as in (a) on the first test and (c) on the second. He has the same
impasse on both tests, but whereas he repairs consistently with a single repair on the first test, he
uses two (or more) repairs on the second test. When the two-test data was examined by hand,
four cases of this phenomenon were found.
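
The two-test patterns described above can be summarized as a small decision rule over the repairs
observed at one impasse. This is a sketch, not the hand analysis used in the studies: the function
name `two_test_pattern`, the repair names, and the exact wording of the returned labels are
assumptions made here. It presumes an analyst has already identified which repair was used at each
occurrence of the impasse.

```python
def two_test_pattern(first, second):
    """Label the behavior at one impasse across two tests.

    first, second: non-empty sequences naming the repair used at each
    occurrence of the impasse on the first and second test.
    """
    a, b = set(first), set(second)
    if not a or not b:
        raise ValueError("the impasse must occur on both tests")
    if len(a) > 1 and len(b) > 1:
        return "tinkering on both tests"
    if len(a) > 1 or len(b) > 1:
        return "bug on one test, tinkering on the other"
    return "stable bug" if a == b else "bug migration"


if __name__ == "__main__":
    print(two_test_pattern(["skip", "skip"], ["relocate", "relocate"]))
    print(two_test_pattern(["skip", "skip"], ["skip", "relocate"]))
```

A student who always skips the decrement on the first test and always relocates it on the second is
labelled as a case of bug migration; one who uses a single repair on the first test and two repairs on
the second falls under "bug on one test, tinkering on the other".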

A summary of the findings

A model of the student population has emerged from the data based on the notions of impasses,
repairs, bugs and slips. Given just one test, the students who are making errors can be put into
four categories in roughly the following proportions:

    50%   Knows the correct algorithm; errors due to slips alone.
    30%   Has a bug or a set of bugs (plus perhaps some slips as well).
    10%   Tinkering, using several repairs for one impasse (plus perhaps some bugs and slips).
    10%   Errors cannot be analyzed.

These proportions vary with the grade level. The above proportions are for third graders tested
late in the year. In general, the older the student population, the greater the proportion of
students in the slips category and the smaller the proportion in the buggy category. In the early
third grade, for example, students in the buggy category constitute over 50% of the sample.

The various kinds of errors are expected to have differing kinds of short-term stability. We expect
slips for example to vary widely over two tests given a short time apart. There may be no slips on
one test, and several on another. If there are slips on both tests, they are not expected to occur on
the same problems. Impasses on the other hand are expected to remain in evidence across tests.
An impasse may show as a bug on one test and on the next as a different but related bug, or as
tinkering. What would be unexplained is a bug that was present on one test but absent on the
other. These considerations prompt the following tabular summary of the percentage of students
exhibiting the various kinds of stability:

     4%   No errors on either test
    50%   Stable correct procedure; changes due to slips alone
    12%   Stable bugs; changes due to slips alone
    12%   Stable impasses; changes due to repairs (often along with slips and stable bugs)
    12%   Appearing and disappearing bugs and/or impasses (with slips and stable bugs)
    10%   Errors on one or both tests can't be analyzed



The stability patterns of the students in the first four categories (78%) conform to expectations,
while the behavior of the students in the remaining two categories (22%) remains unexplained.



In overview, these data show that the older view of errors, as due to either bugs (deterministic,
systematic errors) or slips (careless mistakes), is incomplete. The impasse/repair notions contribute
substantially to our ability to understand error-filled tests (in addition to their role as an
explanation of the acquisition of bugs).

However, a significant proportion of the tests (10% of the static, one-test data, and 22% of the
stability data) cannot yet be analyzed even with these advances. Some of these students are in the
undiagnosed category because the tests were simply not long enough to give the analysts a large
enough sample of their behavior to disambiguate the various possible explanations of the errors.
In other cases, species of behavior that have not yet been formalized were apparent. Some
students appeared to "game" the test by struggling through the first part of it, then giving up and
using some easily executed bug such as Smaller-From-Larger on the rest. Other sources of errors
were rather uninteresting: there seemed to be several cases of cheating by looking at a neighbor's
paper; in one case, a skipping ballpoint pen apparently caused a student to lose track of his
procedure in the middle of several problems. In short, there will undoubtedly be some errors that
have rather uninteresting causes and hence can properly be left unanalyzed in a formal descriptive
study of errors. Our belief is that we have not quite reached that level of understanding yet. We
guess that there remain some undiscovered, interesting mechanisms that will further our
understanding of errors as much as the impasse/repair process did.

2. Background and motivations

This section and the following ones provide a more detailed description of the findings and how
they were obtained. Special attention is given to evaluating DEBUGGY's diagnostic ability and
discussing how it could be used in practical educational settings. To set the stage, a discussion of
the history and motivations of the research is presented.



Many studies of systematic errors in arithmetic preceded the Buggy studies (Buswell, 1926;
Brueckner, 1930; Brownell, 1941; Roberts, 1968; Lankford, 1972; Cox, 1975; Ashlock, 1976). In
all these studies, systematic errors were thought of as incorrect or faulty algorithms with the same
inputs as the correct arithmetic algorithm. In particular, systematic performance was assumed not
to depend on the position of the test item on the page, the nature of the preceding item, fatigue,
or anything else. This assumption is shared by the Buggy studies. To do otherwise would require
orders of magnitude more data per subject so that the influence of these context variables could be
studied. Since we share the belief of our predecessors that the influence is negligible, the
context-free assumption has been built into the bug notation.

The Buggy studies differ from their predecessors in that a precise, formal notation for systematic
errors is used. All the early studies relied upon informal English descriptions of the observed
systematic errors. However, even the most precise natural language descriptions are often flawed.
For example, Cox used the description "Borrowed from the tens column when it was unnecessary"
(Cox, 1975, pg. 155) to notate the following behavior (ibid., pg. 152):

      37     43     86
     - 4    - 1    - 4
      23     32     72

From these problems, it is clear that the tens digit in the answer is off by one, but it is not clear
that extra borrowing is the culprit. Rather, it could be that the student thinks that a subtraction is
necessary in each column, so a one is subtracted in columns that have a blank (this
is the bug Sub-One-Over-Blank; see appendix 1). Cox's description is adding some assumptions
to the naked observation of the behavior. Even if scratch marks are present and it is clear that the
top row's tens digit has been decremented, it is not clear whether the student decrements every
column except the units column, or only those that are over blanks, or only the leftmost column.
The natural language description is seriously incomplete. On the other hand, the syntax of the bug
notation is such that a bug could not be written without taking a stand on when the extra borrows
occur. This insistence on precision and completeness comes quite naturally with a formal notation,
and is a distinguishing characteristic of the Buggy studies.
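
The ambiguity can be made concrete by writing each reading of the English description as a
procedure. The sketch below is illustrative and makes several assumptions: the names `answer` and
`HYPOTHESES` are introduced here, the bottom digits of Cox's three problems are taken to be 4, 1
and 4 (the values that make the units column come out right), and only problems that need no
genuine borrowing are handled. The probe problem 3456-12 is chosen here and does not come from
the studies.

```python
def answer(top, bottom, extra_decrement):
    """Subtract column by column with no real borrows, taking one extra
    off every column for which extra_decrement(index, below, width) is true.
    index 0 is the units column; below is None over a blank.
    """
    t = [int(d) for d in str(top)][::-1]
    b = [int(d) for d in str(bottom)][::-1]
    b += [None] * (len(t) - len(b))
    digits = []
    for i, (x, y) in enumerate(zip(t, b)):
        d = x - (y or 0)
        if extra_decrement(i, y, len(t)):
            d -= 1
        digits.append(d)
    return int("".join(str(d) for d in reversed(digits)))


# Three readings of "borrowed from the tens column when it was unnecessary".
HYPOTHESES = {
    "over blanks only": lambda i, below, width: below is None,
    "every column except units": lambda i, below, width: i > 0,
    "leftmost column only": lambda i, below, width: i == width - 1,
}

if __name__ == "__main__":
    for name, rule in HYPOTHESES.items():
        observed = [answer(t, b, rule) for t, b in [(37, 4), (43, 1), (86, 4)]]
        print(name, observed, answer(3456, 12, rule))
```

All three readings reproduce the observed answers 23, 32 and 72, so the three problems cannot tell
them apart; on the probe they diverge (2344, 2334 and 2444 respectively). A formal notation forces
the analyst to choose one of them.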

The idea that systematic errors can be represented as sets of bugs became the heart of a computer
system named BUGGY. BUGGY had many facets. It could be used as a game to introduce student
teachers to the idea of bugs and to develop their skill in discovering systematic errors in their
students' work. BUGGY could also be used to analyze written tests worked by students in order to
diagnose which bugs (if any) the students had. It was used to analyze addition and subtraction
tests from over a thousand students. This early research is reported in (Brown & Burton, 1978).

Why subtraction?

Based on our early experience with BUGGY, a strategic decision was made to investigate one
procedural skill thoroughly rather than to cast about for examples of systematic errors in many
domains at once. The procedural skill chosen for investigation was ordinary multi-digit
subtraction. Its main advantage, from a psychological point of view, is that it is a virtually
meaningless procedure. Most elementary school students have only a dim conception of the
underlying semantics of subtraction, which are rooted in the base-ten representation of numbers.
When compared to the procedures they use to operate vending machines or play games,
subtraction is as dry, formal and disconnected from everyday interests as the nonsense syllables
used in early psychological investigations were different from real words. This isolation is the bane
of teachers but a boon to the psychologist. It allows one to study a skill formally without bringing
in a whole world's worth of associations.

The goals of the studies 

Since BUGGY, the research developed in several directions. One direction was the development of
Repair Theory, which was described above. In another direction, the technology for diagnosis was
improved and extended by Richard Burton to become the DEBUGGY system. DEBUGGY is able to
produce much more elaborate diagnoses than BUGGY. In addition, it can analyze a set of test
items to measure its diagnosticity, in the sense discussed below. Burton also developed an
interactive version of DEBUGGY (called IDEBUGGY), wherein test items are generated by the
system on the basis of the student's previous answers, thus allowing IDEBUGGY to converge on a
diagnosis faster and with greater certainty. This line of research is reported in (Burton, 1981).

In support of the theoretical and technological lines of research, extensive empirical research has
been necessary. This research involved collection and detailed analysis of a large number of
systematic errors. This empirical investigation and its results are the topic of this paper.



The two major goals of the empirical studies were:

    to pilot test DEBUGGY: To evaluate DEBUGGY's diagnostic capabilities by using it on the
        kind of data that it would encounter if it were deployed as a diagnostic adjunct to the
        curriculum.

    to test Repair Theory: To observe so many bugs that the database of bugs can be taken as
        an approximation of all the subtraction bugs that can occur. Repair Theory should be
        able to generate this set of bugs. Also, the studies were intended to check for the
        existence of tinkering and bug migration.

These are not the only uses that the data can be put to. Indeed, we have been asked so often for
these data that it seems worthwhile to present it in as complete and neutral a fashion as possible.
Thus, for example, data on the frequency of occurrence of bugs will be presented; although very
little use can be made of this data at the present time vis a vis the above goals, the frequency data
is important in design of remediation tools for education. Rather than present the data statistically
or in some other summarized form tailored to meet our goals, it is presented for the most part in
tabular appendices. This allows other investigators to analyze the data as they see fit. The text is
devoted to presenting the concepts involved in bug-level diagnosis, the methods used to collect and
analyze the data, and commentary on the tables.

3. Subjects and methods 

Two important development cycles began back in the BUGGY days and continued throughout the
studies reported here. One was to extend DEBUGGY's diagnostic ability by augmenting its database
of bugs, and the other was to improve the set of test items used to elicit errors. Before discussing
the studies per se, it is best to describe these development cycles.

DEBUGGY cannot invent new bugs. Its inventiveness is limited to creating new sets of bugs from
known bugs. (Creating a new set of bugs may sound trivial, but it is actually quite difficult in
general since bugs can interact with each other in complex ways.) Discovering new bugs is very
important for testing generative theories since it is only by having as complete a database of bugs
as possible that the generative sufficiency of the theory can be ascertained. Since DEBUGGY does
not invent new bugs, the method used to discover new bugs is to use DEBUGGY as a filter to remove
students whose behavior is adequately characterized by some set of existing bugs, leaving the
human diagnosticians to concentrate on discovering any systematicity that lurks in what DEBUGGY
considers unsystematic behavior. When even the barest hint of a new bug is uncovered by the
experts, it is formalized and incorporated in DEBUGGY's database. That way, DEBUGGY will
discover any subsequent occurrences of the bug, even when it occurs with other bugs, and even
when it interacts in non-linear, complex ways with those bugs. So the first cycle consists of
computer analysis, human analysis, and augmentation of the bug database.

Test diagnosticity

The second cycle involves development of highly diagnostic tests. The set of problems given on a
paper and pencil test is probably the most important determinant of our ability to do diagnosis. One
of the facets of the DEBUGGY system is the ability to measure the diagnosticity of a test. Given any
subtraction test, it can calculate exactly how many problems a bug (or set of bugs) will miss on a
test and exactly which pairs of bugs (or pairs of sets of bugs) can not be differentiated because
they get exactly the same answers on the whole test. It can do this even when the bugs in a set
interact in complex ways, such as producing a correct answer to a problem that each bug in the set
would miss had it alone been applied to the procedure. Naturally, one wishes each potential
diagnosis to miss at least one problem on the test so that it can be discovered. However, because
students often make slips, one wants each bug (or set of bugs) to miss several problems on the test
so that it can be discovered even if some of the problems it misses are not matched exactly by the
student's answer due to his careless mistakes. Similarly, one wants redundancy in the problems
that allow two different bugs (or sets of bugs) to be differentiated from each other.

DEBUGGY's measurement of test diagnosticity is just as dependent on the bug database as its
analysis of student errors. If a test passes the diagnosticity test, this only guarantees that any
diagnosis that can be constructed from known bugs will be distinguishable. It makes no such
guarantee about new bugs. Hence, the second development cycle has been to upgrade the paper
and pencil tests as new bugs are discovered.

Given the mutual dependence of the data and the techniques used to acquire it, it is worthwhile
examining the cycles from the very beginning.

The Nicaraguan and Wellesley studies

The original database of bugs was developed by Richard Burton, Kathy Larkin and John Seely
Brown from a collection of 1325 Nicaraguan students' test results (this is the data reported in
Brown & Burton, 1978). Two problems hampered them. One was that they did not have the
actual test papers, but only each student's answers to the tests. Although DEBUGGY makes no use
of the scratch marks that students use, human diagnosticians seem highly dependent on them. The
second problem faced by the investigators was that the diagnosticities of the tests were not high.
In spite of these handicaps, Burton, Larkin and Brown were able to identify 43 bugs while
maintaining confidence that these bugs were not the product of pure speculation. However, the
most important effect of these bugs was to start the cycles rolling. Now better diagnostic tests
could be developed, and a tighter filter on new bugs could be used.




The Nicaraguan students were fourth, fifth and sixth graders who had been taught subtraction by
(Searle, Friend, & Suppes, 1976). The wide variety of bugs discovered in their work made
us wonder what kinds of bugs we would find in American students who had received normal
classroom instruction. A study was conducted with 288 students from Wellesley, Massachusetts
(Haviland, 1979). Although the actual test papers were available, the lack of test diagnosticity
continued to plague the investigators. A second problem arose in that very few students made any
errors at all, probably because the students were sixth graders from an upper class community that
could be expected to have a good school system. Consequently, only a few new bugs were
discovered.

Capitalizing on our past experience, two extensive studies were planned and executed by Jamesine
Friend and the present author with the assistance of Richard Burton, John Seely Brown, and
Elizabeth Berg. Highly diagnostic tests developed with the aid of DEBUGGY were used. The test
papers were available to the human diagnosticians. The students were mostly third and fourth
graders from communities of average social and economic status, with school systems of average
competency.

Several tests were developed with DEBUGGY's help for these studies. The first two tests were for
interim use. They were replaced when enough new bugs had been discovered to make their
diagnosticity unacceptable. The tests that we ended up with had twenty items, and most items had
three or four columns. Here is one of them:

    647    885     83   8305     50    662    742    106    716   1564   6591
     45    205     44      3     23      3    136     70    598    887   2697

    311   1813    102   9007   4015    702   2006  10012   8001
    214    215     39   6880    607    108     42    214     43



Every bug misses at least two problems, and almost all missed three or more. This redundancy
allowed DEBUGGY to detect most bugs even in the presence of a large number of slips. The tests
were also extremely diagnostic in that every possible diagnosis was distinguishable from the others
by at least one problem. (For purposes of defining "all possible diagnoses," DEBUGGY was
restricted to including at most two bugs in a diagnosis.)
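
The diagnosticity measurement described above can be illustrated with a short sketch. The Python below is not DEBUGGY's code. It assumes simplified, single-bug renderings of Smaller-From-Larger and Diff-0-N=N (the function names and bug keys are invented for the example), takes its items from the sample test shown above, and counts how many items each bug misses and which pairs of bugs give identical answers on the whole test.

```python
from itertools import combinations

# Items transcribed from the sample test shown above: (top number, bottom number).
TEST = [
    (647, 45), (885, 205), (83, 44), (8305, 3), (50, 23), (662, 3),
    (742, 136), (106, 70), (716, 598), (1564, 887), (6591, 2697),
    (311, 214), (1813, 215), (102, 39), (9007, 6880), (4015, 607),
    (702, 108), (2006, 42), (10012, 214), (8001, 43),
]

# Hypothetical keys for two simplified bugs; compound diagnoses are not modelled.
BUGS = ["smaller_from_larger", "diff_0_n_eq_n"]


def subtract(top, bottom, bug=None):
    """Column-by-column subtraction, optionally perturbed by one simplified bug."""
    t = [int(d) for d in reversed(str(top))]
    b = [int(d) for d in reversed(str(bottom))]
    b += [0] * (len(t) - len(b))  # pad the shorter bottom number with zeros
    digits, borrow = [], 0
    for td, bd in zip(t, b):
        if bug == "smaller_from_larger":
            # Never borrows: takes the smaller digit from the larger in each column.
            digits.append(abs(td - bd))
            continue
        if bug == "diff_0_n_eq_n" and td == 0 and bd > 0:
            # Assumed reading: in a 0-N column the student writes N instead of
            # borrowing; any pending borrow is passed on to the next column.
            digits.append(bd)
            continue
        cur = td - borrow
        borrow = 1 if cur < bd else 0
        digits.append(cur + 10 * borrow - bd)
    return int("".join(str(d) for d in reversed(digits)))


def answers(bug):
    """Answers the given bug (or the correct procedure, if None) produces on TEST."""
    return [subtract(t, b, bug) for t, b in TEST]


def problems_missed(bug):
    """Number of test items on which the bug yields a wrong answer."""
    return sum(a != t - b for a, (t, b) in zip(answers(bug), TEST))


def indistinguishable_pairs(bugs):
    """Pairs of bugs that give identical answers on every item of the test."""
    return [(x, y) for x, y in combinations(bugs, 2) if answers(x) == answers(y)]


for bug in BUGS:
    print(bug, "misses", problems_missed(bug), "of", len(TEST))
print("indistinguishable pairs:", indistinguishable_pairs(BUGS))
```

Under these assumptions the sketch reports that Smaller-From-Larger misses 17 of the 20 items, the three exceptions being the items that require no borrowing, and that the two bugs can be told apart on this test.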



The first study, called the Southbay study, was designed to discover a large quantity of new
bugs and to pilot test DEBUGGY. Each student's test was carefully examined by at least one and
often three human diagnosticians to corroborate or revise DEBUGGY's diagnosis. Also, a proportion
of the students were retested several months later in order to study the long term stability of bugs.
The second study, called the Short-term study, featured testing students twice, a short time apart
(e.g. Monday and Wednesday). With this design, we could investigate the short term stability of
bugs.

The Southbay study: subjects and test administration

During the 1979-1980 school year 849 students were tested during the Southbay study. Two
school districts in the southern San Francisco bay area agreed to participate in the study. Both
school districts were a fairly heterogeneous composition of social classes. The majority of the
children came from white, English speaking families, although there were some Japanese,
Vietnamese, Korean and RU^p children. There were a number of children with Spanish
surnames. Standardized test scores from the two districts show that one is slightly above the
national norm and the other is at about the 70th percentile.

Each district office sent notices to their elementary school principals, asking for
teachers to volunteer for the program. 33 teachers cooperated in the study. Although the teachers
were self-selected, we have no reason to believe that their students were unrepresentative of their
grade level's population.

Most of the classrooms had a single grade, except the Special Education classrooms, which
included students in fourth, fifth and sixth grades. There were two combination classrooms, one
with third and fourth grades, the other with fifth and sixth grades. The number of classrooms by
grade level is:

    grade                number of classrooms

    Third                15
    Fourth               11
    Fifth                 4
    Special Education     3

Testing of the classrooms was spread over several months in order to ease the load on the
diagnosticians. A few days before the tests were to be administered in a school, Friend went to
the school to conduct a brief teacher training session to acquaint the teachers with the nature of
the study, the tests, and the test administration procedure. The teachers were asked not to impose
a time limit on the children, but to allow them as much time as they needed. (Teachers reported
to us that the test typically took 15 minutes or less to complete, but there were some children who
finished it in 5 minutes and others who took a half hour.) The teachers were also asked to instruct
the students to respond to every item even if they were not sure.

Each classroom's file of student tests was analyzed by DEBUGGY. DEBUGGY's analysis was then
checked by hand, always by at least one person (Friend) and frequently by another (Burton or
VanLehn). The objective of the hand analysis was to discover new bugs and to check that
DEBUGGY's diagnoses were reasonable. DEBUGGY's analyses, with any corrections that the
diagnosticians felt were necessary, were compiled into a report for the teacher. This report,
together with a teacher's guide that explained how to use it, was mailed to the teacher.

The teacher's guide (Friend & Burton, 1980) contains a few pages for each bug, describing the
bug, giving examples of the errors it produces, and sometimes making suggestions for remediation.
A page of DEBUGGY-generated exercises is included that is suitable for remediating the bug in that
the exercises cause the bug to exhibit symptoms.

The diagnostic reports and the guide were seen merely as the incentive for teachers to participate
in the study, and not as a serious attempt at deploying bug-level diagnoses in the classroom.
However, informal follow-up interviews with some of the teachers exposed some interesting
problems that deploying this kind of detailed diagnostic information in the curriculum appears to
face. These problems will be discussed in a later section of the paper.



To study long-term stability of bugs, some of the students in the Southbay study were reported to
the teachers as needing retesting. Whether they actually were retested was up to the teacher, and
ultimately only 154 students were retested. A student was recommended for retesting if either (1)
he could not be diagnosed, or (2) his diagnosis included one of the less common bugs. In
retrospect, it would perhaps have been better to ask the teachers at the outset to retest their whole
class at a later date. However, it was felt at the time that they would not be receptive to this since
recent concern with educational quality has mandated a very large number of tests, and it was felt
that teachers would not want to include yet another test in their crowded schedules without a very
compelling reason. However, 13 of the teachers chose to do the retesting recommended to them,
and one of them asked to be allowed to retest her whole class since she found that easier than
retesting only specific individuals in the class. It appears that we could have asked teachers to
retest their whole class and gotten equally high or higher participation.



The Short-term study: subjects and test administration

The second study to be reported here was designed to test the short term stability of bugs. Tests
were administered two days apart, generally on a Monday and the following Wednesday.
Teachers were asked not to give any instruction in subtraction between tests.

To control for the possibility that students would remember the previous test's answers rather than
recalculating, four testing conditions were used. In condition 1, students received exactly the same
test form both days. In condition 2, students received the same problems, but in a different order.
In condition 3, the order of problems was the same, but each problem was changed slightly. In
condition 4, both the order and content of problems were changed. In short, the conditions
differed slightly in the content and/or order of the items. However, it turned out the test scores
did not improve significantly in any of the four conditions (p > .10 in all four cases, using the
Mann-Whitney U test).

The subject acquisition, test administration and analysis procedures were as described for the
Southbay study. Only three third grade classes elected to participate in the study since the school
year was almost over. A total of 67 students completed both tests.

Data analysis 

Bugs are modifications of some correct procedure for the skill. To represent a systematic error, it
is necessary to state which correct procedure is being modified by the given set of bugs, in the case
when more than one correct procedure is possible. There are several different subtraction
procedures taught in different parts of the world. Moreover, even in the United States, several
variations of the "standard" procedure are taught, differing particularly with regard to the use of
scratch marks. However, since our largest sample populations were drawn from school systems
that taught similar subtraction procedures, we have been able to represent the subtraction data
using just one correct procedure. Nonetheless, the database and the diagnostic programs are
designed to handle multiple alternative correct procedures.

DEBUGGY analyzes a test in three stages. First, it grades the test. The students who make no
errors on the test are placed in the Correct category. Next, the set of bugs that fits the errors the
best is found. Lastly, DEBUGGY decides whether the fit is good enough to put the student in the
Buggy category. If not, then it decides whether there are few enough errors that the student can
be put in the Slips category. If there are too many errors, then the student is placed in the
Undiagnosed category.

The criteria used to assign subjects to the Buggy, Slips and Undiagnosed categories are complex
and ad hoc. They were designed to mimic the intuitions of human diagnosticians. The following
is a rough characterization of them (see Brown & Burton, 1978, for a complete treatment).

Rules for assigning diagnostic category

1. If the diagnosis predicts all the student's answers, both correct and incorrect, then he is
assigned to the Buggy category.

2. Also, a student is assigned to the Buggy category if (i) the diagnosis makes more true
predictions than false predictions and (ii) makes "enough" true predictions about wrong
answers. The latter criterion is meant to prevent a student from being diagnosed as having
bugs when in fact his wrong answers are equally well predicted by the hypothesis that he
is performing the correct algorithm with an overlay of slips. Since the Buggy and Slips
hypotheses agree whenever the bugs predict a correct answer, it is the problems where the
bugs predict wrong answers that split these two hypotheses. The criterion of "enough"
true predictions of wrong answers is implemented by the following three conditions:

    a. Of the answers predicted to be wrong by the bugs, r^% or more are indeed
    the answers given by the student (i.e. are true predictions).

    b. Of the answers predicted to be wrong by the bugs, 50% or more are true
    predictions, and there is only one bug in the set of bugs, and more than half the
    wrong answers given by the student are predicted by the bug.

    c. Of the answers predicted to be wrong by the bugs, there are more true
    predictions than there are false predictions of all kinds, and there are at least two
    true predictions.

3. If a student is not assigned to the Buggy category by rules 1 or 2, and he has gotten
90% of the problems correct, then he is assigned to the Slips category.

4. Otherwise, he is assigned to the Unsystematic category.
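
The rules above can be sketched in a few lines of Python. This is only a rough rendering, not DEBUGGY's algorithm: the function name and arguments are invented for illustration, the threshold for condition 2a is passed in as a parameter because its value is not legible in the text, the three conditions are read as alternatives (which the text does not state outright), and "false predictions of all kinds" in condition 2c is taken to mean every false prediction the diagnosis makes.

```python
def assign_category(correct, given, predicted, n_bugs, threshold_a):
    """Rough category assignment for one student's test.

    correct, given, predicted: per-item lists holding the true answer, the
    student's answer, and the answer predicted by the best-fitting diagnosis.
    n_bugs: number of bugs in that diagnosis.
    threshold_a: fraction used by condition 2a (a caller-supplied assumption).
    """
    n = len(correct)
    if given == correct:
        return "Correct"  # graded first: no errors at all

    true_preds = sum(p == g for p, g in zip(predicted, given))
    false_preds = n - true_preds
    if false_preds == 0:
        return "Buggy"  # rule 1: every answer is predicted

    # Restrict attention to items where the diagnosis predicts a wrong answer.
    hits = [p == g for p, g, c in zip(predicted, given, correct) if p != c]
    true_wrong = sum(hits)
    student_wrong = sum(g != c for g, c in zip(given, correct))
    frac = true_wrong / len(hits) if hits else 0.0

    cond_a = frac >= threshold_a
    cond_b = frac >= 0.5 and n_bugs == 1 and 2 * true_wrong > student_wrong
    cond_c = true_wrong > false_preds and true_wrong >= 2
    if true_preds > false_preds and (cond_a or cond_b or cond_c):
        return "Buggy"  # rule 2

    if 10 * (n - student_wrong) >= 9 * n:
        return "Slips"  # rule 3: at least 90% of the problems correct
    return "Undiagnosed"  # rule 4


# Illustrative call with made-up answers; 0.75 is an arbitrary stand-in for 2a.
correct = [602, 680, 39, 8302, 27, 659, 606, 36, 118, 677]
predicted = [602, 680, 41, 8302, 33, 661, 614, 176, 282, 1323]
print(assign_category(correct, predicted, predicted, n_bugs=1, threshold_a=0.75))
```

Keeping the 2a threshold as an argument makes it easy to see how sensitive the Buggy/Undiagnosed split is to that setting.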

The actual algorithm used by DEBUGGY is more complex than this. DEBUGGY has an ability to
model certain kinds of slips which it uses to decide which of two competing sets of bugs is the best
fitting diagnosis (Burton, 1981). Slips are used to temper the decisions of rule 2, although they do
not in fact play a role in rule 3 despite the fact that rule 3 decides between the Slips category and
the Undiagnosed category. The emphasis in the DEBUGGY research has been on modeling bugs,
not slips.

As mentioned above, the database of bugs grew during the Southbay and Short-term studies.
Since the contents of the database affect DEBUGGY's diagnosis, all the tests from both studies were
reanalyzed by DEBUGGY at the conclusion of these studies, after the newly discovered bugs had
been installed in the database. The data from the reanalysis are the ones reported in this paper.

4. Evaluating DEBUGGY's diagnostic expertise

One of the main goals was to determine whether DEBUGGY could be relied upon to diagnose bugs
as accurately as expert human diagnosticians. There were two reasons to believe that it might not
be reliable. First, DEBUGGY was not given the scratch marks that some students use to help them
do borrowing. Our experience has been that the scratch marks are invaluable to human
diagnosticians. It was not at all clear that DEBUGGY could succeed given just the answers. The
second area of uncertainty in DEBUGGY's design was the heuristics used to split Buggy students
from Undiagnosed students, and to determine which of two diagnoses is a better predictor of the
student's answers. These can be difficult decisions for human diagnosticians, let alone DEBUGGY.

DEBUGGY versus the experts

Table 4.1 of appendix 4 (repeated on the next page) compares DEBUGGY's categorization of
students with the human diagnosticians' using the Southbay data. This is a three-by-three table, by
diagnostic category. If there were perfect agreement, all the off-diagonal entries would be zero.
Although they are not zero, in most cases they are fairly small, with two exceptions. The experts
found Buggy diagnoses for 30 (13%) of the students that DEBUGGY considered Undiagnosed.
These cases are due to the fact that the human diagnosticians could see the scratch marks, and
hence could be more sure of the systematicity of the errors which DEBUGGY found but rejected as
being not quite systematic. The other large off-diagonal entry represents 59 cases of the expert
deciding that there was not enough evidence to rate a diagnosis as Buggy even though DEBUGGY
thought so (these 59 cases represent 20% of DEBUGGY's Buggy diagnoses). Again, we believe these
represent inherent differences between DEBUGGY and the human diagnosticians' judgments caused
by the latter's access to the scratch marks.

                                    Experts' Diagnosis

    DEBUGGY's Diagnosis     Slips        Buggy        Undiagnosed    totals

    Slips                   148 (95%)      4 ( 3%)      3 ( 2%)      155 (100%)
    Buggy                     3 ( 1%)    233 (79%)     59 (20%)      295 (100%)
    Undiagnosed              18 ( 8%)     30 (13%)    188 (80%)      236 (100%)

    totals                  169          267          250            686

However, it appears that DEBUGGY's classification algorithm is a good compromise in that about
the same percentage of students are mis-classified into the Buggy and Undiagnosed categories.
That is, it is biased neither towards systematicity nor against it. This leads to the conclusion that a
nearly optimal setting of the parameters of the classification algorithm has been reached.

Improved classification would require giving DEBUGGY access to the scratch marks and/or a more
complete model of slips.
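
The percentages in the table above can be recomputed from the raw counts. The following Python sketch does so and also derives an overall agreement rate, a summary figure that the text itself does not report; the variable and function names are chosen purely for illustration.

```python
CATEGORIES = ["Slips", "Buggy", "Undiagnosed"]

# Rows: DEBUGGY's diagnosis. Columns: experts' diagnosis, in CATEGORIES order.
# Counts are those given in Table 4.1.
TABLE = {
    "Slips": [148, 4, 3],
    "Buggy": [3, 233, 59],
    "Undiagnosed": [18, 30, 188],
}


def row_percentages(table):
    """Percentage of each DEBUGGY category that the experts placed in each category."""
    return {
        row: [round(100 * count / sum(counts)) for count in counts]
        for row, counts in table.items()
    }


def column_totals(table):
    """Number of students the experts placed in each category."""
    return [sum(counts[i] for counts in table.values()) for i in range(len(CATEGORIES))]


def overall_agreement(table):
    """Fraction of students given the same category by DEBUGGY and the experts."""
    diagonal = sum(table[cat][i] for i, cat in enumerate(CATEGORIES))
    total = sum(sum(counts) for counts in table.values())
    return diagonal / total


print(row_percentages(TABLE))
print(column_totals(TABLE))
print(f"overall agreement: {overall_agreement(TABLE):.1%}")
```

The two large off-diagonal cells discussed in the text appear here as the 20% entry in the Buggy row and the 13% entry in the Undiagnosed row.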

Another way that DEBUGGY could be inaccurate is by choosing the wrong diagnosis for a Buggy
student when several diagnoses were in close competition. Table 4.2 of appendix 4 shows how the
diagnoses given by DEBUGGY were corrected by the human diagnosticians. In almost every case
(220 out of 233, or 94%), there was substantial agreement between the experts' diagnoses and
DEBUGGY's, and in many cases (193, or 83%) their diagnoses were identical. This shows that
DEBUGGY was excellent at choosing among competing hypotheses. In short, DEBUGGY and the
experts agree on what the systematic error consists of, but sometimes disagree on how systematic the
error is.

DEBUGGY versus the experts with matched-item tests

In an effort to probe DEBUGGY's expertise more deeply, we again compared it to expert judgments
but handicapped it. Using the Short-term study, the experts were allowed to use both tests of a
student in performing their diagnosis while DEBUGGY analyzed each test individually. Since the
test forms were matched item-by-item, the experts could see the same item answered twice and
thus more easily come to an intuitive assessment of whether an error was due to a slip or a bug.
In order to have a basis for comparison, two experts analyzed the tests independently. The
differences between their diagnoses could be used as a baseline for the differences between
DEBUGGY's diagnoses and expert ones. The results are summarized in appendix 4.

Roughly speaking, the experts agreed more with DEBUGGY than they did with each other. The only
substantive difference was that the experts put more students in the Slips category than DEBUGGY,
and fewer in the Undiagnosed. This can be attributed to their access to matched test items, which
presumably allowed them to more confidently assert that non-buggy errors came from slips rather
than from some unknown cause.

To sum up, it appears that DEBUGGY using just the answers is as good as the expert diagnosticians
using the answers plus scratch marks, plus matched-item tests. The decision to allow it no
access to the scratch marks appears to have been, on balance, a good one. It makes the system
practical because much less data (namely the answers alone) needs to be entered off the test sheets.
The lack of this information does not appear to hurt its ability to formulate a bug-level diagnosis
at all, although it does appear to hurt its ability to assess whether a diagnosis deserves the Buggy
classification (the difference in judgment could also result from DEBUGGY's incomplete treatment
of slips).

5. The Undiagnosed Category 

When all the tests in both the Southbay and Short-term studies are considered, a large percentage
were assigned to the Undiagnosed category. Similar figures occurred in the Nicaraguan study, as
shown below:

    Category        Nicaraguan     Southbay & Short-term

    Correct           37
    Buggy            505 (39%)     417 (40%)
    Slips            116 ( 9%)     223 (22%)
    Undiagnosed      667 (52%)     386 (37%)

    totals          1325          1138

The figures in parentheses are the percentages of the students who made errors that were assigned
to each category. In both studies, a substantial proportion of the population could not be
diagnosed. What is causing the errors in these students' performances?

The "math disabilities" hypothesis

One conjecture is that the Undiagnosed students are not well practiced in subtraction, math
phobic or "stupid." If test scores are taken as a measure of such math disabilities, then there is
no evidence for this explanation. The scores of the Buggy and Undiagnosed students are almost
identical. Figure 1 is a graph of the distributions of test scores. One line shows the distribution
for students in the Buggy category and the other shows students in the Undiagnosed category.
These distributions are very similar, except in two places. One is a large peak in the Buggy
distribution at a score of three answers correct. This is due to the fact that the test had just three
problems that did not require borrowing (only form 2 is covered by the graph, since it was the
most commonly used form). This peak is almost entirely due to the bug Smaller-From-Larger, of
which there were 106 occurrences. This bug misses all but three problems on the test.



Figure 1: The distribution of raw test scores is plotted for the Buggy category (solid line) and the
Undiagnosed category (dashed line). The x-axis is the number of problems answered correctly.
The y-axis is the percentage of students in the category who obtained that score. The test had 19
problems. Students who missed only one or two problems are not shown.

The other peak is in the Undiagnosed distribution, at the 16-answer correct mark. We think this
peak is due to students making careless mistakes, i.e. slips. This peak is the leading edge of a much
larger slips peak that would continue upward from 16 answers correct to 18 answers correct (the
test has 19 problems), except that DEBUGGY's classification algorithm places students who get
three or more problems wrong in the Undiagnosed category rather than the Slips one.
This peak probably represents students who should have been put in the Slips category (more on
this conjecture in a moment). These two peaks aside, the distributions are very similar. When the
points corresponding to these two peaks (namely the points at 3, 4, 15 and 16) are left out, the
correlation between the two distributions is .37 (p < .05).




The similarity of the test score distributions sparked a more intensive hand analysis of the tests of
students in the Undiagnosed category. We already knew that it was difficult to do better than
DEBUGGY given just one test (cf. the preceding section). However, it was discovered during the
Short-term study that much of the uncertainty in hand diagnosis was removed by having two
executions of the same test available, even though the tests were taken several days apart. Since
each item was answered twice, it was easier to separate slips from repeated, conceptually based
errors. Using pairs of tests, informative hand diagnosis of the Undiagnosed class proved possible.

The tests of students in the Short-term study who were Undiagnosed on one or the other of their
tests were analyzed by the author with an eye to finding what proportion of them could be
accounted for with slips and tinkering.



The major source of unsystematic performance appears to be facts errors and other slips (Norman,
1981). Slips are unintended actions. The person had the competence to choose and execute a
certain action but, during performance, he did not. In addition to errors in the basic subtraction
facts, one often sees cases where

    o a decrement is forgotten,
    o a digit is incremented instead of decremented,
    o the last column of a problem is not processed,
    o an extra, inappropriate borrow is performed,
    o a digit is decremented by two when the bottom digit in its column is a two,
    o and many others.



45% of the students who were placed in the Undiagnosed category on the basis of a single test
were analyzed as performing the correct algorithm with several slips. (These performances are
particularly easy to pick out since slips do not in general appear on the same problems on both
tests. That is, if errors on the first test are not matched by errors on the same problems on the
second test, then it is likely that the student's errors are due to slips. Perhaps this observation
could be used in constructing aids for this style of two-test diagnosis, or for single-test diagnosis
using redundant items.)
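
The observation about unmatched errors suggests a simple aid of the kind mentioned. The Python sketch below is one hypothetical way to express it, not a procedure used in the study: it assumes the two tests contain the same items in the same order, and the cutoff `max_repeat` is an arbitrary illustrative value.

```python
def wrong_items(given, correct):
    """Indices of the items a student answered incorrectly."""
    return {i for i, (g, c) in enumerate(zip(given, correct)) if g != c}


def repeated_error_fraction(first, second, correct):
    """Share of all erred items that were missed on both of the matched tests."""
    w1 = wrong_items(first, correct)
    w2 = wrong_items(second, correct)
    erred = w1 | w2
    return len(w1 & w2) / len(erred) if erred else 0.0


def looks_like_slips(first, second, correct, max_repeat=0.25):
    """Heuristic: errors that rarely recur on the same items suggest slips.

    max_repeat is an assumed cutoff for illustration, not a value from the study.
    """
    return repeated_error_fraction(first, second, correct) <= max_repeat


# Made-up answers to six items: the errors fall on different items each day.
correct = [602, 680, 39, 8302, 27, 659]
monday = [612, 680, 39, 8302, 27, 659]
wednesday = [602, 680, 39, 8302, 27, 669]
print(looks_like_slips(monday, wednesday, correct))
```

A student whose errors recur on the same items across both tests would fail this check and would be a candidate for bug-level diagnosis instead.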

Slips occur on top of bugs as well as the correct procedure. Here, their occurrence complicates
and sometimes prevents DEBUGGY from reaching a diagnosis. That is, the underlying competence
is well modelled by bugs, but slips create enough noise that diagnosis is blocked. Our tests were
designed so that almost all bugs get at least three problems wrong, but there are many that miss
six or fewer problems. When slips modify a significant percentage of the answers that the bug
will cause the student to get wrong, which for some bugs means the student need slip on only, say,
three of the six problems it would miss, then there will probably not be enough evidence (either
for DEBUGGY or a human expert) to diagnose the performance as Buggy. However, the two-test
samples allow slips to be uncovered even when they are on top of bugs. It was found that 14% of
the Undiagnosed students had bugs that were covered by slips.

A third conjecture is that much unsystematic error is due to tinkering. 18% of the Undiagnosed
students in the two-test sample were found to be tinkering.

The remaining 23% of the two-test sample exhibited a pattern of errors that was too stable to be
analyzed as due to slips, but could not be accounted for as tinkering or slip-ridden bugs. The
expert simply did not have enough data on each student's performance to reach a definitive
diagnosis, even with the two tests. The source of the errors in their performance remains a
mystery.

The following table summarizes the findings given above:

     45%    Slips
     14%    Bugs
     18%    Tinkering
     23%    Unknown
    100%    total

The main reason that so many students could not be diagnosed by DEBUGGY is that they were
making too many slips. However, there was also some tinkering present among the Undiagnosed
students, as well as a non-trivial amount of truly puzzling behavior. For DEBUGGY to do better, it
probably would have to have much more redundant test items so that it could locate slips in the
way that the expert did, namely, by comparing a student's performance on identical or nearly
identical problems.



6. Repair Theory's predictions 

the main claim of Repair Theory is that it can generate all the objwrvcd bugs. In feet, it can not 
do this, but for a good B^n^ RepaU' patittliir prbccs, namd 

an impasc then patchmSt with some highly local prpblan solving. However; Ihis process is in 
effect parameterized set of impasses ahd repairs it is equipped with. The more impass« 
and repairs, the more bu^ it can generate. If enough imMSses and repairs were used, all the bugs 
could be generated. Indwd. we have coUected and precisely described an ad hoc but erapiricaUy 
sufBcient "^t of impasses ffld repairs. Some of these are mentioned in the appendix 5, and others 
are described in the "periodic table" section of (Brown & VanLehn. 1980). 

However, we would rather have principled sets of impasses and repairs, ones that are generated
from some constraints imposed by the task domain or the learning sequence the student is in the
midst of. This would, for example, allow one to predict the bugs that would occur in a
procedural skill before any data were collected. One highly principled technique for generation of
impasses is used in (Brown & VanLehn, 1980). But it generates only a few of the needed
impasses and consequently only 18 of the 77 observed bugs (the observed bugs are listed in
appendix 3 of this paper). Other principles for generating the sets of impasses and repairs are
being investigated that will hopefully converge on generating the ad hoc observation-based sets
used in this paper.
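The dependence of the generated bugs on the sets of impasses and repairs can be pictured as a cross product. The sketch below is a toy and not Repair Theory itself: the impasse and repair names are placeholders invented for illustration, and in the theory not every impasse-repair pair need yield a distinct or admissible bug.

```python
from itertools import product

def generate_candidate_bugs(impasses, repairs):
    """Pair every impasse with every repair; each pair names one candidate bug."""
    return [f"{impasse}+{repair}" for impasse, repair in product(impasses, repairs)]

# Placeholder names, assumed for illustration only.
impasses = ["cannot-decrement-zero", "cannot-borrow-into-zero"]
repairs = ["skip-step", "swap-digits", "write-top-digit"]

candidates = generate_candidate_bugs(impasses, repairs)
print(len(candidates))  # 2 impasses x 3 repairs

# Equipping the generator with one more impasse adds one candidate per repair.
more = generate_candidate_bugs(impasses + ["cannot-borrow-twice"], repairs)
print(len(more))
```

The point of the sketch is only that the generator's coverage is fixed by what it is equipped with, which is why a principled source for the two sets matters more than their size.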

Using co-occurrence frequencies as evidence



In section 3.i5 of Brown & VanLehn (1980), the authors consider a new technique for generating
impasses based on overgeneralization. The argument they present is interesting for the way it uses
the data. It makes a prediction about the frequency of bugs co-occurring with other bugs. That
prediction happened to be correct in the Southbay data, and the argument and its prediction are
worth recapitulating here for the way they use the frequency data.



There are several bugs that could be generated by various repairs if only an impasse occurred
whenever the student borrowed into a zero. That is, when the student comes to process a column
with zero on top and a non-zero number on the bottom, instead of borrowing he does some repair.
The bugs Diff-0-N=N and Diff-0-N=0 are two such bugs. (See appendix 1 for a description of
these bugs.) Brown and VanLehn suggest that these bugs occur because the student does not
know how to borrow from zero. When such a student is asked to do a borrow from zero, he
violates the precondition that a zero can not be decremented. He then overgeneralizes the newly
discovered prohibition "you can't borrow from zero" to include "you can't borrow into zero." So
in addition to hitting an impasse whenever he borrows from zero, the student's overgeneralized
precondition blocks borrowing in 0-N columns, causing an impasse which leads to the two 0-N
bugs. Crucially, this explanation predicts that when the two 0-N bugs occur in compound
diagnoses, another bug in the diagnosis will involve an inability to do borrowing across zero (i.e.
recursive borrowing). Close examination of appendix 2 verifies this prediction. The results are
summarized in the following two tables. The first counts the number of times the two Diff-0-N
bugs occurred with recursive borrow bugs, and the second counts the number of times they
occurred with other kinds of bugs. As predicted, there are more occurrences with recursive borrow
bugs than with other bugs.

Occurrences with Recursive Borrow Bug (plus others, in some cases)

15   Borrow-Across-Zero
17   Stops-Borrow-At-Zero
 3   Smaller-From-Larger-Instead-of-Borrow-From-Zero
 1   Borrow-From-Bottom-Instead-of-Zero
36   total

Occurrences with Non-Recursive Borrow Bug (plus others, in some cases)

 7   Borrow-No-Decrement
 4   Smaller-From-Larger
 1   Diff-N-0=0
 1   Blank-Instead-of-Borrow
 1   Borrow-From-Zero
14   total



This kind of argumentation using the frequency of co-occurrence is interesting, and one we would
like to be able to do more of. Unfortunately, the sample is too small to see such effects except
with extremely common bugs such as Diff-0-N=N. Most bugs occur less than a half dozen times,
a quantity so small that differential co-occurrence frequencies are often not statistically meaningful.
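The comparison made by the two co-occurrence tables amounts to a pair of tallies. A minimal sketch follows, using the counts as listed; the variable names are ours, and no significance test is implied.

```python
# Co-occurrence counts for the two Diff-0-N bugs, as listed in the two tables.
with_recursive_borrow = {
    "Borrow-Across-Zero": 15,
    "Stops-Borrow-At-Zero": 17,
    "Smaller-From-Larger-Instead-of-Borrow-From-Zero": 3,
    "Borrow-From-Bottom-Instead-of-Zero": 1,
}
with_other_bugs = {
    "Borrow-No-Decrement": 7,
    "Smaller-From-Larger": 4,
    "Diff-N-0=0": 1,
    "Blank-Instead-of-Borrow": 1,
    "Borrow-From-Zero": 1,
}

recursive_total = sum(with_recursive_borrow.values())
other_total = sum(with_other_bugs.values())
share = recursive_total / (recursive_total + other_total)

# Partner bugs seen fewer than six times, the range the text calls too small.
sparse = [bug for table in (with_recursive_borrow, with_other_bugs)
          for bug, n in table.items() if n < 6]

print(recursive_total, other_total, f"{share:.0%}")
print(len(sparse), "of 9 partner bugs occur fewer than six times")
```

Only three of the nine partner bugs reach even half a dozen co-occurrences, which illustrates why the argument can be made for Diff-0-N=N but not for most other bugs.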

Reifying the impasse-repair process

Repair Theory was developed well before the Short-term study was conducted. At that time, it
was unknown whether the impasse-repair process was actually a process the students went through
during the course of solving some arithmetic problem, or whether it could not be reified and had
to remain an essentially mathematical model, that is, a formal specification of a set of bugs,
implemented as a function whose internal chronology had nothing to do with students' real,
temporal behaviors on arithmetic tests. A few students had been found who appeared to be
tinkering, but the single test data was not redundant enough to eliminate the possibility that what
appeared to be shifting among repairs was in fact just slips. The Short-term tests were critical in
finding out whether the impasse-repair process, or rather its chronology, could be reified. Showing
that the sequence of states traversed by the model while generating a bug could in some fashion be
observed in the real, temporal behavior of a student who developed that bug would justify
claiming "psychological reality" at a much finer grain size, namely the grain size of the process
itself rather than just its output.

The data from the Short-term study are presented in an appendix. They are analyzed two ways, by
DEBUGGY, which can not analyze the data in terms of tinkering, impasses and repairs, and by an
expert who can. When the data are analyzed by DEBUGGY, there is a moderate amount of
instability:




Stable procedure; changes due to slips alone
    21 (31%)   Stable correct algorithm
     2 ( 3%)   Stable bugs
    23 (34%)   subtotal

Stable impasses (plus perhaps some stable bugs); changes due to differing repairs
     2 ( 3%)   Bug migration

Unstable procedure; changes unexplained
    11 (16%)   Correct procedure changes to Undiagnosed
    19 (28%)   Bugs appear or disappear (plus perhaps some stable bugs)
    30 (45%)   subtotal

Unanalyzable
    12 (18%)   Undiagnosed both tests

    67 (100%)  total

Although two cases of bug migration were found, these were embedded in so much unexplained
shifting among procedures that it is uncertain whether the bug migration cases were caused by the
mechanisms that Repair Theory postulated, or whether they were due to whatever it was that was
causing so many other bugs to appear and disappear. To resolve this uncertainty, the 67 students
were thoroughly analyzed by hand. The results came out somewhat differently:

Stable procedure; changes due to slips alone
    36 (54%)   Stable correct algorithm
     3 ( 4%)   Stable bugs
    39 (58%)   subtotal

Stable impasses (plus perhaps some stable bugs); changes due to differing repairs
     0 ( 0%)   Bug migration
     6 ( 9%)   Tinkering on one test, bug on the other
     6 ( 9%)   Tinkering on both tests
    12 (18%)   subtotal

Unstable procedure; changes unexplained
     1 ( 1%)   Correct procedure changes to Undiagnosed
     5 ( 7%)   Bugs appear or disappear (plus perhaps some stable bugs)
     7 (10%)   Impasses appear or disappear (plus perhaps some stable bugs)
    13 (19%)   subtotal

Unanalyzable
     4 ( 6%)   Undiagnosed both tests

    67 (100%)  total






There is still a significant amount of unexplained shifting among procedures, but it is comparable
in size to the amount of shifting explained by Repair Theory. A large amount of tinkering was
found. Hence, it can be concluded that reifying the impasse-repair process is justified.

7. Prospects for deploying bug-level diagnosis

As a user, the author found Burton's DEBUGGY program truly amazing. It performs much better
than I can given the same input; in fact, none of the diagnosticians on the project feel they can
match DEBUGGY's ability to do diagnosis when given only the answers and not the scratch marks.
Despite the fact that it is a very large, very complicated program that uses Artificial Intelligence
technology, it is extremely reliable. Typically, it is left running unattended overnight.

DEBUGGY was built as a research vehicle, rather than a practical way to deploy diagnostic
technology in the educational system. However, it has succeeded so well that it is worth examining
the possibility of making it a practical service to teachers.

Diagnosing the Undiagnosed 

One feature is crucially important to add to DEBUGGY. In a third of the cases, namely the
Undiagnosed students, DEBUGGY has nothing informative to tell the teacher about the student. It is
only when the student is Buggy that the teacher receives information about the student that can be
used for placement or remediation. In the near term, bug-level diagnosis should be combined with
standard criterion or subskill-based diagnosis (e.g. Friend, 1981) for the Undiagnosed students.

Unfortunately, the tests used in this study are totally inappropriate for subskill-based diagnosis.
Subskill-based tests require test items which isolate each subskill from the others. Thus, if the
ability to perform borrows in adjacent columns is considered a subskill, a number of items must
require that subskill but not others. This allows the missing subskill to be identified by the binary
pattern of right and wrong answers. Since almost every test item on our tests requires using
almost every subtraction subskill to solve it (which is exactly what one wants in an optimal test for
detecting bugs), the pattern of right and wrong answers reveals next to nothing about the presence
and absence of subskills. Burton (1981) suggests it would be possible to construct a test that both
isolated subskills and remained highly diagnostic for bugs. How many test items such a test
would have to have remains to be seen.
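The contrast between the two kinds of test can be made concrete with a small sketch of subskill inference from a right/wrong pattern. It is an illustration under assumptions: the function `missing_subskills`, the subskill names, and both toy tests are hypothetical, and real subskill-based diagnosis is more elaborate.

```python
def missing_subskills(item_requirements, correct):
    """Infer suspect subskills from a binary right/wrong pattern.

    item_requirements: list of sets, the subskills each item requires.
    correct: list of bools, whether the student got each item right.
    A subskill is suspected missing when no correctly answered item required it.
    """
    all_skills = set().union(*item_requirements)
    demonstrated = set()
    for required, ok in zip(item_requirements, correct):
        if ok:
            demonstrated |= required
    return all_skills - demonstrated

# A test whose items isolate subskills (hypothetical item design).
isolating = [{"facts"}, {"facts", "borrow"}, {"facts", "borrow-from-zero"}]
print(missing_subskills(isolating, [True, True, False]))

# A test in which every item needs every subskill, as on the tests used here.
entangled = [{"facts", "borrow", "borrow-from-zero"}] * 3
print(missing_subskills(entangled, [False, False, False]))
```

On the isolating test a single wrong item points at one subskill; on the entangled test the same inference can only report that every subskill is suspect, which is the sense in which the right/wrong pattern reveals next to nothing.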

Moving to new procedural skills: the "new bugs" problem

The central limitation on DEBUGGY is its inability to invent new bugs. Although it can invent new
systematic errors, so to speak, by forming never-before-seen sets of bugs, some of which have
complex non-linear interactions, this is not the same as inventing a new bug. DEBUGGY helps
researchers find new bugs by filtering out students with known bugs, thus allowing human
diagnosticians to focus on the apparently unsystematic students.

However, this is not a strong filter since a third of the students are classified as Undiagnosed. This
leaves the human diagnosticians with a large number of tests to analyze.
Consequently, it requires a great deal of effort to build a database of bugs the size of the one that
DEBUGGY now has. Lumping together the work of the six diagnosticians that have worked on the
project over the last several years, we estimate that four or five thousand hours were spent doing
hand diagnosis. Have we found all the bugs, or are another several person-years necessary to
guarantee accurate diagnosis as well as ensure that the database of bugs is complete enough to
adequately test generative theories of bugs?

The following table is a rough summary of the rate at which new bugs were discovered. It should
be taken as an estimate only since we did not record the dates that bugs were entered in the
database.

                                                    new bugs    total

Nicaraguan Study (December 1977)                       45         45
Wellesley Study (October 1979)                         15         60
First data from Southbay Study (November 1979)         30         90
Further data from Southbay Study (April 1980)          15        105
Last data from Southbay Study (June 1980)              10        115

The original Nicaraguan study initialized the database with 45 bugs. The Wellesley study a year
later added only 15 bugs because the test was not very diagnostic and the students were too
competent. The Southbay study was extremely productive, adding 55 new bugs to the database.
Most of these came at the beginning of the study. One could conclude from the fact that the rate
of new bugs being entered decreases that the database was converging. However, it appears to be
converging rather slowly. In short, we probably have not found all the bugs, and it could take
substantial effort to do so.
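The running totals behind this discussion are a cumulative sum over the per-stage additions. A small sketch, using the stage labels and counts from the summary of bug discovery (the variable names are ours):

```python
from itertools import accumulate

# New bugs entered at each stage, as summarized in the text.
stages = [
    ("Nicaraguan Study (December 1977)", 45),
    ("Wellesley Study (October 1979)", 15),
    ("First data from Southbay Study (November 1979)", 30),
    ("Further data from Southbay Study (April 1980)", 15),
    ("Last data from Southbay Study (June 1980)", 10),
]

totals = list(accumulate(n for _, n in stages))
southbay_added = sum(n for name, n in stages if "Southbay" in name)

for (name, added), total in zip(stages, totals):
    print(f"{name:<50}{added:>4}{total:>6}")
print("added during Southbay:", southbay_added)
```

The three Southbay stages add 30, then 15, then 10 bugs, which is the slowly decreasing rate the paragraph describes; with no dates of entry recorded, nothing finer than this stage-level view can be recovered.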

This means that additional methods should perhaps be sought to reject generative theories of bugs
when they appear to be generating "too many" bugs that don't occur in the database. That is,
since the database can not be assumed to be complete, the fact that a particular theory generates a
bug or even many bugs that are not in the database is not sufficient cause to reject that theory. A
method based on absurd or "star" bugs is used to reject theories that overgenerate in (Brown &
VanLehn, 1980). Its merits and shortcomings are discussed there.

Although having a nearly complete database of bugs is very important for testing generative
theories of bugs, it may be that it is not very important for diagnostic applications in education if
the newly discovered bugs turn out to be very rare. To show whether the new bugs were indeed
rare, the bugs in the table of bug occurrence frequencies in appendix 3 have been annotated to
indicate whether they were in the database at the beginning of the Southbay study. Although most
of the new bugs occurred only once or twice, some were not at all rare. One of the very last bugs
to be entered, 0-N=N-Except-After-Borrow, turned out to be the sixth most common
bug upon reanalysis of the Southbay data. So, the new bugs were not uniformly rare. To get a
diagnostic instrument of even medium accuracy, discovery of new bugs must be pursued vigorously
through several iterations of the test development cycle.

One possible way to reduce the need for bug discovery is to pre-load the database by giving the
imagination free rein to invent as many conceivable bugs as possible. This was not done in our
studies because increasing the size of the database increases the number of hypotheses considered
by DEBUGGY during analysis; on the computers used originally, this caused DEBUGGY to run out
of memory. We now run DEBUGGY on computers with much larger address spaces, so loading the
database with speculative bugs is feasible.

However, our experience during the Southbay study indicates that the imagination, even of
experienced diagnosticians, is not powerful enough to invent some of the bugs that children invent.
For example, take the bug Decrement-Multiple-Zeros-By-Number-To-Right. When borrowing
across several zeros, this bug changes the rightmost one to 9, the next one to 8, the next to 7,
and so on. It will answer problems in this fashion:

      30006
    -     7
      27899

This bug seemed incredible to us when we first noticed it. It was certainly not a bug that we
would have thought of before we saw it. But having seen this bug, it seemed plausible that its
symmetric partner, the bug Decrement-Multiple-Zeros-By-Number-To-Left, might exist. Both bugs
were added to the database. DEBUGGY found four cases of one and three cases of the other. This
anecdote illustrates how data and the imagination both are necessary to fill out the database.
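The behavior of Decrement-Multiple-Zeros-By-Number-To-Right, as described in words above, can be simulated in a few lines. The sketch is our own reading of that description and is not taken from DEBUGGY: the function name is hypothetical, and it assumes a non-negative bottom number no larger than the top and fewer than ten consecutive zeros.

```python
def subtract_with_zero_countdown(top, bottom):
    """Column subtraction with the bug: when a borrow crosses a run of zeros,
    the rightmost zero becomes 9, the next 8, the next 7, and so on
    (a correct borrow would turn every one of them into 9).
    """
    digits = [int(d) for d in str(top)][::-1]     # least significant first
    lower = [int(d) for d in str(bottom)][::-1]
    lower += [0] * (len(digits) - len(lower))

    answer = []
    for i in range(len(digits)):
        if digits[i] < lower[i]:
            j = i + 1
            replacement = 9
            # Walk left across zeros to the first non-zero digit.
            while j < len(digits) and digits[j] == 0:
                digits[j] = replacement           # buggy: 9, then 8, then 7, ...
                replacement -= 1
                j += 1
            if j == len(digits):
                raise ValueError("no column left to borrow from")
            digits[j] -= 1
            digits[i] += 10
        answer.append(digits[i] - lower[i])
    return int("".join(str(d) for d in reversed(answer)))

print(subtract_with_zero_countdown(4005, 7))   # buggy answer; the correct one is 3998
print(subtract_with_zero_countdown(52, 17))    # no zeros crossed, so the answer is correct
```

The symmetric partner mentioned in the text would presumably count in the opposite direction along the run of zeros; it is left out here because the text does not spell out its behavior.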

Repair Theory offers a different approach to pre-loading a database with bugs. Essentially, it
provides a way of transferring the imagination and experience expended on subtraction to other
domains. The two bugs mentioned just above, for example, could be generated by Repair Theory
when it is equipped with certain repairs. When the base skill of the theory is changed from
subtraction to, for example, multiplication, the analogs of these two bugs would be generated. In
this fashion, the database could be pre-loaded by Repair Theory. Those bugs in combination with
those suggested by the imagination of experts might provide a fairly good initial set. What
proportion of the ultimate set of observed bugs these would turn out to be is, at this point, a matter
of conjecture.

Microcomputer DEBUGGY

It has often been suggested that a quick-and-dirty version of DEBUGGY be built for a micro-computer
so that it could be used by the teacher or at home.

By examining the data, we can guess at how much diagnostic ability one would lose in such an
implementation.

Appendix 2 shows that only a few sets of bugs occurred more than once. Of the 157 distinct sets
of bugs, only 50 occurred more than once. The other 107 diagnoses occurred exactly once. One
might say that there is a long "tail" to the frequency distribution. Of course, the bulk of the
students fell under the non-tail part of the distribution; the 50 multiply-occurring sets of bugs
accounted for 310 (74%) of the 417 Buggy students.

One inference that can be drawn is that a diagnostic instrument that can only diagnose high and
medium frequency systematic errors will fail to diagnose roughly a quarter of the students that
DEBUGGY can diagnose. That is, if it is unable to diagnose the students in the long tail, then it
will miss 107 (or 26%) of the 417 students that DEBUGGY diagnosed. Moreover, it is these rare
bugs that most need to be diagnosed since these are the ones teachers can't detect, thereby causing
the student to be perceived as "random," which in turn could perhaps lead to the students
developing a permanent fear of mathematics.
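The coverage figures in this argument follow from three counts given in the text. A short sketch of the arithmetic, with variable names of our own choosing:

```python
distinct_diagnoses = 157    # distinct sets of bugs observed
recurring_diagnoses = 50    # sets of bugs that occurred more than once
buggy_students = 417        # students DEBUGGY diagnosed as Buggy

# Every remaining diagnosis occurred exactly once, so it covers one student.
singleton_diagnoses = distinct_diagnoses - recurring_diagnoses
students_in_tail = singleton_diagnoses
students_covered = buggy_students - students_in_tail

print(singleton_diagnoses, students_covered)
print(f"covered by recurring diagnoses: {students_covered / buggy_students:.0%}")
print(f"missed in the tail:             {students_in_tail / buggy_students:.0%}")
```

Two thirds of the distinct diagnoses (107 of 157) sit in the tail, yet they account for only about a quarter of the Buggy students; both views of the same distribution appear in the discussion above.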

This lengthy tail was also found in the Nicaraguan data. However, the fall-off in frequency was
not as rapid as it is in the Southbay data, and the tail was not as long. We believe the explanation
for this lies in the higher diagnosticity of the tests used in the Southbay study, and in
improvements to DEBUGGY and its database. Because the Nicaraguan tests were not as diagnostic,
students that would receive distinct diagnoses using the Southbay tests actually received the same
diagnosis. Similarly, DEBUGGY's increased prowess at discriminating bugs (due mostly to the larger
variety of bugs in its database, but also to its ability to use up to four bugs in a diagnosis instead
of only two) tended to split students that it once would have given the same diagnosis to into
separate, although perhaps overlapping, diagnoses. In short, when more accurate diagnosis is
performed, one finds that there are more distinct diagnoses. Consequently, the frequency of any
given set of bugs tends to decline towards one.

This means that the worse the diagnostic tests and analyzer are, the shorter the tail on the
distribution and the higher the frequency of individual diagnoses. Hence, a quick-and-dirty
high-frequency-only diagnostic program would justify itself falsely. Suppose a diagnostic instrument can
only detect Smaller-From-Larger and a few of the other high-frequency bugs. Then, students who
in fact have one of the less common bugs that is similar to Smaller-From-Larger, such as
Smaller-From-Larger-Instead-of-Borrow-From-Zero or Smaller-From-Larger-When-Borrowed or
Borrow-Once-Then-Smaller-From-Larger, would be diagnosed as having Smaller-From-Larger. The
instrument's assumption that most students have one of the high frequency diagnoses would be
unfairly vindicated.
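The self-justifying effect can be shown with a toy simulation. Everything in it is assumed for illustration: the class of ten students and its counts are invented, and the `LOOKS_LIKE` mapping merely encodes the confusions the paragraph names.

```python
from collections import Counter

# A coarse instrument that knows only one high-frequency bug.
KNOWN_BUGS = {"Smaller-From-Larger"}

# Assumed confusions: rarer bugs that such an instrument would mistake for it.
LOOKS_LIKE = {
    "Smaller-From-Larger-Instead-of-Borrow-From-Zero": "Smaller-From-Larger",
    "Smaller-From-Larger-When-Borrowed": "Smaller-From-Larger",
    "Borrow-Once-Then-Smaller-From-Larger": "Smaller-From-Larger",
}

def coarse_diagnosis(true_bug):
    """Report a known bug, the known bug it resembles, or Undiagnosed."""
    if true_bug in KNOWN_BUGS:
        return true_bug
    return LOOKS_LIKE.get(true_bug, "Undiagnosed")

# Hypothetical class of ten students; these counts are not from the study.
true_bugs = (
    ["Smaller-From-Larger"] * 4
    + ["Smaller-From-Larger-Instead-of-Borrow-From-Zero"] * 2
    + ["Smaller-From-Larger-When-Borrowed"] * 2
    + ["Borrow-Once-Then-Smaller-From-Larger"] * 2
)

actual = Counter(true_bugs)
reported = Counter(coarse_diagnosis(bug) for bug in true_bugs)
print("actual:  ", actual["Smaller-From-Larger"], "of", len(true_bugs))
print("reported:", reported["Smaller-From-Larger"], "of", len(true_bugs))
```

Judged by its own reports, the coarse instrument finds its one bug in every student and no tail at all, although fewer than half of this invented class actually has that bug.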

More implications for a quick-and-dirty diagnostic instrument can be drawn from the bugs
that occur in the high-mid frequency diagnoses in appendix 1. Considering just the 50 diagnoses
(i.e. sets of bugs) that occur more than once, one finds 32 different bugs. These figures become
meaningful when compared to the 77 bugs that occurred overall. A large fraction of the bugs (32
of 77, or 42%) is necessary just to get the high frequency sets of bugs. This has implications for
the design of a quick-and-dirty diagnostic instrument. It says that such a program would not
benefit from including an ability to do bug compounding, that is, to cope with the non-linear
interactions involved in forming sets of bugs dynamically. It would still need to store 32 bugs just
to get the high frequency diagnoses; it would be cheaper in terms of space and time just to store
the 50 high frequency sets of bugs. Of course, this is incredibly risky in that slight perturbations
in the data presented here would boost some diagnoses over the single-occurrence line, and others
under it.




Using DEBUGGY's diagnoses

We do not know whether providing bug-level diagnoses to the teachers will help them educate
their students. That effect depends crucially on what they do with the information. However,
simply handing them the information does not appear to be effective. The teachers who
participated in our studies were given a teacher's manual (Friend & Burton, 1981) that described
each bug in detail along with, for some bugs, common sense suggestions for remediation. A sheet
of exercises, suitable for photocopying and designed to exhibit the bug, was included in the
manual for each bug. In informal meetings with the teachers after they had had their class's
diagnostic report and the manual for some time, we found that in general they did not understand
the concept of a bug, and consequently did not use the information we gave them as well as they
could have, if at all.

This is consistent with our experience in using the Buggy game, a micro-computer game designed
to teach the concept of a bug to student teachers (Brown & Burton, 1978). Players often
commented after playing the game that they learned a great deal from it, and that it changed the way
they thought of students' errors (Brown & Burton, 1978, appendix 1). In short, the bug concept is
sufficiently foreign to teachers that a certain amount of teacher training, perhaps as little as a
couple of hours playing the Buggy game, is necessary before they can begin to use the diagnostic
information that DEBUGGY provides.

Longitudinal effects

Before one can assess the effects of bug-based remediation, it is necessary to know what happens
naturally to bugs. In particular, do they persist throughout school or do current educational
practices eventually remediate them?

A relationship was found between bugs and grade level. The following table shows, for each
grade, how many subjects were assigned to each diagnostic category. (The students in remedial
and special education classes, of which there were 30, have not been included in this table. Eight
sixth graders are included in the fifth grade since that was the grade level at which they were
receiving instruction.)

Grade     Correct     Slips       Buggy       Undiagnosed   totals

Third     32 ( 6%)    64 (13%)    237 (49%)   148 (30%)     481 (100%)
Fourth    48 (15%)    77 (24%)     87 (27%)   104 (32%)     316 (100%)
Fifth     19 (19%)    41 (41%)     13 (13%)    25 (25%)      98 (100%)

totals    99 (11%)   182 (20%)    337 (38%)   277 (31%)     895 (100%)

As one would expect, the proportion of the sample assigned to the Correct and Slips categories
rises as the grade level increases from three to five. However, it was somewhat surprising to find
that the proportion assigned to the Undiagnosed category remained relatively constant across grade
levels. It appears that the systematic errors of the younger students become the
systematic correct algorithm of the older student, while current teaching practices seem to leave the
Undiagnosed errors unremediated. Should this trend in fact underlie the variation of the table, it
would imply that remediation should be focused on the Undiagnosed student rather than the Buggy one.
However, the actual situation is much more complicated than that, as the long term stability data
shows.
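The trend read off the grade-level table can be checked by recomputing each category's share within a grade. In the sketch below the counts are those of the table as best they can be read; two cells, Fourth-grade Slips (77) and Fifth-grade Undiagnosed (25), are inferred here from the row and column totals and should be treated as assumptions.

```python
CATEGORIES = ("Correct", "Slips", "Buggy", "Undiagnosed")

# Counts per grade in category order; two cells are inferred (see above).
counts = {
    "Third":  (32, 64, 237, 148),
    "Fourth": (48, 77, 87, 104),
    "Fifth":  (19, 41, 13, 25),
}

def shares(row):
    """Return each category's fraction of the grade's students."""
    total = sum(row)
    return [n / total for n in row]

for grade, row in counts.items():
    cells = "  ".join(f"{name} {share:.0%}"
                      for name, share in zip(CATEGORIES, shares(row)))
    print(f"{grade:<7}{sum(row):>4}  {cells}")
```

Recomputed this way, the Buggy share falls steeply with grade while the Undiagnosed share stays within a few points of 30%, which is the pattern the paragraph above describes.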

A long-term stability study was included as part of the Southbay study. By long-term, we mean
that the tests were given at least two months apart, but still within the same school year.
Appendix 6 presents the long-term data. The basic finding is that about half the Buggy students
became Undiagnosed and about half the Undiagnosed students became Buggy. This casts doubt
on the hypothesis that current teaching practices are succeeding in correcting systematic errors but
not in correcting unsystematic errors. However, the instability in the short term makes these
category figures somewhat suspect. It could be that these long term instability figures are so
buried in short term instability that they do not in fact reveal the underlying trends. So, we will
have to leave the question of the efficacy of remediation on systematic error until a better
understanding of short-term stability is developed.

Although the changes in diagnostic category are a mystery, the changes in the bugs of the 34
students who were Buggy on both tests follow the patterns that one would expect intuitively.
There were 17 students who had the same or overlapping bug diagnoses on both tests. Appendix
6 shows their diagnoses. The bugs in common on both tests show long-term stability. Not
surprisingly, the most frequently stable bugs are Smaller-From-Larger, Stops-Borrow-At-Zero and
Borrow-Across-Zero, which are exactly the three most common bugs.

There were 17 students who had non-overlapping diagnoses on the two tests. In most cases, the
changes showed that the student was learning more about subtraction. For example, the students
who had Smaller-From-Larger on the first test had bugs involving borrowing on the second test.
Evidently, they had learned something about borrowing between the two tests, but not enough to
do it perfectly. Another classic case is a student who moved from Borrow-From-Zero to
Stops-Borrow-At-Multiple-Zero. This student evidently learned about borrowing across one zero
between tests, but had not yet learned borrowing across several zeros, a skill which is usually
taught after borrowing across one zero. Of the 17 students, there were only four who did not
show this pattern.
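The split of the 34 twice-Buggy students into "same or overlapping" and "non-overlapping" diagnoses is a set comparison. A minimal sketch, treating a diagnosis as a set of bug names; the function name and the second example pair are our own illustrations, not cases from appendix 6.

```python
def compare_diagnoses(first, second):
    """Classify two bug-level diagnoses (collections of bug names)."""
    first, second = set(first), set(second)
    if first == second:
        return "same"
    if first & second:
        return "overlapping"
    return "non-overlapping"

def stable_bugs(first, second):
    """Bugs present on both tests, i.e. those showing long-term stability."""
    return set(first) & set(second)

# The case described in the text: a move between two different bugs.
print(compare_diagnoses({"Borrow-From-Zero"}, {"Stops-Borrow-At-Multiple-Zero"}))

# A hypothetical overlapping case: one bug persists, another disappears.
before = {"Smaller-From-Larger", "Diff-0-N=N"}
after = {"Smaller-From-Larger"}
print(compare_diagnoses(before, after), stable_bugs(before, after))
```

A set comparison of this kind says nothing about whether a non-overlapping change reflects learning; that judgment, as in the Borrow-From-Zero example, rests on knowing the order in which the skills are taught.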

8. Conclusions

Prior to the studies reported here, it was felt that all subtraction errors could be modelled with two
mechanisms: bugs and slips. Slips were a "performance" phenomenon, expected to be
highly unstable over time and only loosely related to the subtraction problems they occur in. Bugs
were a "competence" phenomenon reflecting mistaken beliefs about the skill and as such, bugs were
expected to be consistent across a whole test and stable across tests given some days apart.

There were two areas of uncertainty marring this early view of a world composed of bugs and
slips. First, half the students that we had analyzed at that point (i.e. in the Nicaraguan study) were
not consistently following bugs during the test. We attributed this to the shortness of the test and
its lack of diagnosticity: if only we had more data on each of those students, the bugs they had
could be diagnosed. But there remained the possibility that these students were not really buggy,
and that a third mechanism would have to be added to bugs and slips to model their behavior.

The second uncertainty was raised by teachers who reported that they had seen bugs (or what
appeared to them to be bugs) come and go over very short periods of time. It could be that bugs
are not stable in the way that bugs in computer programs are, where a change requires the active
participation of the teacher or some other aspect of the learning environment. If they were not
stable, then a new mechanism would be needed to model how bugs changed so quickly with no
outside intervention. In short, the parsimony of the bug-slips view of the world was threatened.

The Southbay study was conducted with excellent tests, an improved DEBUGGY, and a dedicated
staff of experienced diagnosticians. This diagnostic power seemed sufficient to determine whether
bugs and slips alone would be enough to model any individual's performance on a test, or whether
a significant proportion of the population could not be analyzed in these terms. The latter
hypothesis was correct: 34% of the population could not be diagnosed in terms of bugs and slips.



The Short-term study, although conducted with only a small number of students, sufficed to test
whether bugs were in general stable. They are not. Of the students who had bugs on the first
test, only 12% had the same bugs on the second test, and some had no bugs at all. There are
definitely major short term instabilities in bugs.

Data was also collected to compare the long-term stability of bugs by testing several months apart.
Roughly speaking, the long-term stability data is very similar to the short-term stability data. This
implies that the bug instabilities could be a result of testing itself rather than the time between
tests. That is, short-term instability can not be attributed to students remembering the test from
the preceding day and actively trying to do the present test differently, nor can long term
instability be attributed to remediation or spontaneous remission of the bugs. Instead, it appears
that instability results from students taking a different "mental set" for the duration of each test.

This challenges us to change our image of a bug as something that necessarily exists over time as
part of the child's long term belief about subtraction. Instead, many students' bugs appear to be
manufactured as the test begins and held consistently for the duration of the test, only to be
dismissed and evaporate from memory as soon as the test is over and they have served their
purpose, namely to get the student through the test. That is, bugs appear to be manifestations of
rational, albeit incorrect, problem solving strategies for working the test. As such, there is no
reason for a student to retain the strategy after the test is over; it or another like it can be
generated again next time. (Indeed, practice may have exactly the wrong effect here. When the
student has just invented a bug, practice may solidify the bug in memory, thus making remediation
more difficult.)

This view is formalized by Repair Theory. It describes the kind of local problems and their
solutions that lead students to perform as if they had bugs. It has proved successful in explaining
a certain percentage of the behavior that DEBUGGY could not diagnose; these students appear to
have been tinkering with various problem solving strategies in the midst of the test, and thus did
not exhibit consistent enough behavior for DEBUGGY to characterize it with bugs. Repair Theory
was successful in explaining the apparent instabilities of bugs. Students who applied one problem
solving strategy consistently on the first test, and thus were diagnosed as having bugs, often applied
a different problem solving strategy to solve the same local problems on the second test, leading to
a different bug-level diagnosis on that test. Often, they chose to use a variety of problem solving
strategies on the second test, which explains how they can appear to have no bugs at all on that
test.

Despite the success of Repair Theory, there are still a substantial number of students whose errors
can not be explained, and an even larger number whose changing pattern of errors over two tests
can not be explained. These are fit targets for further theoretical and empirical work.

Yet even at the bug level, the data present several challenges. The goal of explaining how such a
wide variety of bugs could exist is being tackled by ourselves and others (Brown & VanLehn,
1980; Young & O'Shea, forthcoming). Such an effort could lead to new, deeper theories of skill
acquisition. Others are investigating more immediate applications of bug-level diagnosis.
Bunderson (1980) is doing controlled experiments to determine whether remediation can be
improved if the tutors are given the student's diagnosis by DEBUGGY. Resnick has found some
methods of teaching that appear to successfully remediate bugs, and is looking into methods of
teaching that will prevent bugs from being developed in the first place (Resnick, 1981). Some
preliminary research (VanLehn & Brown, 1978) has been directed toward understanding
theoretically how a procedure could be given meaning for the student, thereby blocking bug
formation. Yet another challenge is to push the Buggy paradigm beyond place-value arithmetic
into other kinds of problem solving and procedural skills. A simplified bug-like notation has been
found sufficient to represent systematic errors in signed-number arithmetic, and a diagnostic system
for it has been developed (Tatsuoka, Birenbaum, Tatsuoka & Baillie, 1980). Evidence of bugs has
been found in the work of students doing high school algebra (Carry, Lewis & Bernard, 1979;
Matz, 1980). The concept of a bug in procedural skills seems to have wide applicability as a basis
for developing new psychological and pedagogical theories.

Of course, the real star of this investigation is DEBUGGY. It proved to have a seemingly
superhuman ability to perform diagnosis. Occasionally the experts differed with its opinions, but
no more so than they differed among themselves. As a research tool, it was superb. However,
there are problems with employing it in an educational system. One is that teachers are ill
equipped to use its diagnoses. Not only is the concept of a bug foreign to them, but there
currently exist no remedial aids or programs that employ bug-level diagnostic information. A
second major problem is that it has taken a great deal of effort to accumulate the extensive
database of subtraction bugs we now have. To duplicate this effort for each new procedural skill is
a daunting task. Repair Theory could be of help here in that it can in principle generate bugs for
a new procedural skill with only a few changes. Testing this ability is just one of the exciting
directions further research can pursue.

Appendix 1
The Bug Glossary: 
A description of each bug 



0-N=0/AFTER/BORROW

When a column has a 1 that was changed to a 0 by a previous borrow, the student writes 0 as the answer to that column.
(914 - 486 = 508)

0-N=0/EXCEPT/AFTER/BORROW

Thinks 0-N is 0 except when the column has been borrowed from. (906 - 484 = 502)

0-N=N/AFTER/BORROW

When a column has a 1 that was changed to a 0 by a previous borrow, the student writes the bottom digit as the answer
to that column. (512 - 136 = 436)

0-N=N/EXCEPT/AFTER/BORROW

Thinks 0-N is N except when the column has been borrowed from. (906 - 484 = 582)

1-1=0/AFTER/BORROW

If a column starts with 1 in both top and bottom and is borrowed from, the student writes 0 as the answer to that
column. (812 - 518 = 304)

1-1=1/AFTER/BORROW

If a column starts with 1 in both top and bottom and is borrowed from, the student writes 1 as the answer to that
column. (812 - 518 = 314)

ADD/BORROW/CARRY/SUB

The student adds instead of subtracting but he subtracts the carried digit instead of adding it.
(54 - 38 = 72)

ADD/BORROW/DECREMENT

Instead of decrementing the student adds 1, carrying to the next column if necessary.

  8 6 3      8 9 3
- 1 3 4    - 1 0 4
  7 4 9      8 0 9

ADD/BORROW/DECREMENT/WITHOUT/CARRY

Instead of decrementing the student adds 1. If this addition results in 10 the student does not carry but simply writes
both digits in the same space.

  8 6 3      8  9 3
- 1 3 4    - 1  0 4
  7 4 9      7 10 9

ADD/INSTEADOF/SUB

The student adds instead of subtracting. (32 - 15 = 47)

ADD/LR/DECREMENT/ANSWER/CARRY/TO/RIGHT

Adds columns from left to right instead of subtracting. Before writing the column's answer, it is decremented and
truncated to the units digit. A one is added into the next column to the right. (411 - 215 = 527)

ADD/NOCARRY/INSTEADOF/SUB

The student adds instead of subtracting. If carrying is required he does not add the carried digit. (47 - 25 = 62)

ALWAYS/BORROW

The student borrows in every column regardless of whether it is necessary. (488 - 229 = 1159)
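
The ALWAYS/BORROW description can be read as a small procedure. The following is a minimal Python sketch of that reading, not DEBUGGY's own implementation. The function name `always_borrow` is hypothetical, and the sketch assumes that the leftmost column is subtracted normally (there is nothing to borrow from) and that a column answer greater than 9 is written down as it stands.

```python
def always_borrow(top: int, bottom: int) -> str:
    """Simulate ALWAYS/BORROW: borrow in every column that has a left neighbour."""
    t = [int(d) for d in str(top)][::-1]          # top digits, units first
    b = [int(d) for d in str(bottom)][::-1]       # bottom digits, units first
    b += [0] * (len(t) - len(b))                  # blanks in the bottom count as 0
    column_answers = []
    for i in range(len(t)):
        if i < len(t) - 1:                        # every column except the leftmost
            t[i] += 10                            # add 10 whether or not it is needed
            t[i + 1] -= 1                         # and decrement the next column
        column_answers.append(t[i] - b[i])
    # Column answers above 9 are written as-is, so the result is kept as a string.
    return "".join(str(a) for a in reversed(column_answers))


print(always_borrow(488, 229))  # 1159, the glossary example
```

The result is returned as a string because a column answer such as 15 occupies a single answer space on paper.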

ALWAYS/BORROW/LEFT

The student borrows from the leftmost digit instead of borrowing from the digit immediately to the left. (733 - 216 =
427)

BLANK/INSTEADOF/BORROW

When a borrow is needed the student simply skips the column and goes on to the next.
(425 - 283 = 22)

BORROW/ACROSS/TOP/SMALLER/DECREMENTING/TO

When decrementing a column in which the top is smaller than the bottom, the student adds 10 to the top digit,
decrements the column being borrowed into and borrows from the next column to the left. Also the student skips any
column which has a 0 over a 0 or a blank in the borrowing process.

1 8 3 5 1-3. ^ 

- - 9 S , '268 I 
9 7 J i 4 • ' / 

BORROW/ACROSS/ZERO

When borrowing across a 0, the student skips over the 0 to borrow from the next column. If this causes him to have to
borrow twice he decrements the same number both times.

  9 0 4      9 0 4
-     7    - 2 3 7
  8 0 7      5 7 7
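
The two BORROW/ACROSS/ZERO examples can be reproduced with a short simulation. This is a sketch under stated assumptions rather than the model used by DEBUGGY: the name `borrow_across_zero` is hypothetical, the top number is assumed to be at least as large as the bottom number, and zeros that are skipped are left unchanged.

```python
def borrow_across_zero(top: int, bottom: int) -> int:
    """Simulate BORROW/ACROSS/ZERO: skip over 0s and decrement the next non-zero digit."""
    t = [int(d) for d in str(top)][::-1]          # top digits, units first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))                  # blanks in the bottom count as 0
    answer = []
    for i in range(len(t)):
        if t[i] < b[i]:                           # a borrow is needed
            t[i] += 10
            j = i + 1
            while j < len(t) and t[j] == 0:       # the bug: zeros are skipped, not changed
                j += 1
            if j < len(t):
                t[j] -= 1                         # same digit may be decremented twice
        answer.append(t[i] - b[i])
    return int("".join(str(d) for d in reversed(answer)))


print(borrow_across_zero(904, 7))    # 807
print(borrow_across_zero(904, 237))  # 577
```

In the second problem the 9 is decremented once for the units borrow and again for the tens borrow, which is the "decrements the same number both times" behaviour.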

BORROW/ACROSS/ZERO/OVER/BLANK

When borrowing across a 0 over a blank, the student skips to the next column to decrement. (402 - 6 = 306)

BORROW/ACROSS/ZERO/OVER/ZERO

Instead of borrowing across a 0 that is over a 0, the student does not change the 0 but decrements the next column to
the left instead. (802 - 304 = 308)

BORROW/ADD/DECREMENT/INSTEADOF/ZERO

Instead of borrowing across a 0, the student changes the 0 to 1 and doesn't decrement any column to the left.
(307 - 108 = 219)

BORROW/ADD/IS/TEN

The student changes the number that causes the borrow into 10 instead of adding 10 to it. (83 - 29 = 51)

BORROW/DECREMENTING/TO/BY/EXTRAS

When there is a borrow across 0's, the student does not add 10 to the column he is doing but instead adds 10 minus the
number of 0's borrowed across.

  3 0 8      3 0 0 8
- 1 3 9    - 1 3 5 9
  1 6 8      1 6 4 7

BORROW/DIFF/0-N=N&SMALL-LARGE=0

The student doesn't borrow. For columns of the form 0 - N he writes N as the answer. Otherwise he writes 0. (304 -
179 = 270)

BORROW/DONT/DECREMENT/TOP/SMALLER

The student will not decrement a column if the top number is smaller than the bottom number.

  7 3 2      7 3 2
- 4 8 4    - 4 3 4
  2 5 8      2 9 8

  Wrong      Correct

BORROW/DONT/DECREMENT/UNLESS/BOTTOM/SMALLER

The student will not decrement a column unless the bottom number is smaller than the top number.

  7 3 2      7 3 2
- 4 8 4    - 4 3 4
  1 5 8      3 0 8

BORROW/FROM/ALL/ZERO

Instead of borrowing across 0's, the student changes all the 0's to 9's but does not continue borrowing from the column
to the left. (3006 - 1807 = 2199)

BORROW/FROM/BOTTOM

The student borrows from the bottom row instead of the top one.

8 7 8 2 7 . - ' ' . V ■ — 

- 2 8 - 2 0 8 ^ \ 

—T^ 8 3 9 

BORROW/FROM/BOTTOM/INSTEADOF/ZERO

When borrowing from a column of the form 0 - N, the student decrements the bottom number instead of the 0.
6 0 8 .1 0 8 . 

• 2 4 9 4 9 * • , . , 

■ . 3 7 5 ■ . — n . ,.- . -y ' 
BORROW/FROM/LARGER

When borrowing, the student decrements the larger digit in the column regardless of whether it is on the top or the
bottom. (872 - 294 = 598)

BORROW/FROM/ONE/IS/NINE

When borrowing from a 1, the student treats the 1 as if it were 10, decrementing it to a 9.
(316 - 139 = 267)

BORROW/FROM/ONE/IS/TEN

When borrowing from a 1, the student changes the 1 to 10 instead of to 0. (414 - 277 = 237)

BORROW/FROM/ZERO

Instead of borrowing across a 0, the student changes the 0 to 9 but does not continue borrowing from the column to the
left.

  3 0 6      3 0 0 6      1 0 3
- 1 8 7    - 1 8 0 7    -   4 5
  2 1 9      1 2 9 9      1 5 8

BORROW/FROM/ZERO&LEFT/OK

Instead of borrowing across a 0, the student changes the 0 to 9 but does not continue borrowing from the column to the
left. However, if the digit to the left of the 0 is over a blank then the student does the correct thing.

  3 0 6      3 0 0 6      1 0 3      2 0 3
- 1 8 7    - 1 8 0 7    -   4 5    -   4 5
  2 1 9      1 2 9 9        5 8      1 5 8

  Wrong      Wrong        Correct    Correct

BORROW/FROM/ZERO/IS/TEN

When borrowing across 0, the student changes the 0 to 10 and does not decrement any digit to the left. (604 - 235 =
479)

BORROW/IGNORE/ZERO/OVER/BLANK

When borrowing across a 0 over a blank, the student treats the column with the zero as if it weren't there.
5 0 5 5 0 8 ' 

7 7 
—5 ..5 0 1 
Wrong Correct 

BORROW/INTO/ONE=TEN

When a borrow is caused by a 1, the student changes the 1 to a 10 instead of adding 10 to it.
(71 - 38 = 32)

BORROW/NO/DECREMENT

When borrowing the student adds 10 correctly but doesn't change any column to the left.
(62 - 44 = 28)

BORROW/NO/DECREMENT/EXCEPT/LAST

Decrements only in the last column of the problem. (6262 - 4444 = 1828)

BORROW/ONCE/THEN/SMALLER/FROM/LARGER

The student will borrow only once per exercise. From then on he subtracts the smaller from the larger digit in each
column regardless of their positions. (7127 - 2389 = 5278)

BORROW/ONCE/WITHOUT/RECURSE

The student will borrow only once per problem. After that, if another borrow is required the student adds the 10
correctly but does not decrement. If there is a borrow across a 0 the student changes the 0 to 9 but does not decrement
the digit to the left of the 0.

5 3 5 4 0 8 

« 2 7 8 ' 13 9 
3 i 7 1 i 9 

BORROW/ONLY/FROM/TOP/SMALLER

When borrowing, the student tries to find a column in which the top number is smaller than the bottom. If there is one
he decrements that; otherwise he borrows correctly.
(9283 - 3566 = 5627)

BORROW/ONLY/ONCE

When there are several adjacent borrows, the student decrements only with the first borrow.
(535 - 278 = 357)

BORROW/SKIP/EQUAL

When decrementing, the student skips over columns in which the top digit and the bottom digit are the same. (923 -
427 = 406)

BORROW/TEN/PLUS/NEXT/DIGIT/INTO/ZERO

When a borrow is caused by a 0 the student does not add 10 correctly. What he does instead is add 10 plus the digit in
the next column to the left. (50 - 38 = 17)

BORROW/TREAT/ONE/AS/ZERO

When borrowing from 1, the student treats the 1 as if it were 0; that is, he changes the 1 to 9 and decrements the
number to the left of the 1. (313 - 159 = 144)

BORROW/UNIT/DIFF

The student borrows the difference between the top digit and the bottom digit of the current column. In other words,
he borrows just enough to do the subtraction, which then always results in 0. (86 - 29 = 30)

BORROW/WONT/RECURSE

Instead of borrowing across a 0, the student stops doing the exercise. (8035 - 2662 = 3)

BORROWED/FROM/DONT/BORROW

When there are two borrows in a row the student does the first borrow correctly but with the second borrow he does not
decrement (he does add 10 correctly). (143 - 88 = 155)

CANT/SUBTRACT

The student skips the entire problem. (8 - 3 = )

COPY/TOP/IN/LAST/COLUMN/IF/BORROWED/FROM

After borrowing from the last column, the student copies the top digit as the answer. (80 - 34 = 76)

DECREMENT/ALL/ON/MULTIPLE/ZERO

When borrowing across a 0 and the borrow is caused by 0, the student changes the right 0 to 9 instead of 10.
(600 - 142 = 457)

DECREMENT/BY/ONE/PLUS/ZEROS

When there is a borrow across zero, decrements the number to the left of the zero(s) by an extra one for every zero
borrowed across. (4005 - 6 = 1999)

DECREMENT/BY/TWO/OVER/TWO

When borrowing from a column of the form N - 2, the student decrements the N by 2 instead of 1. (83 - 29 = 44)

DECREMENT/LEFTMOST/ZERO/ONLY

When borrowing across two or more 0's the student changes the leftmost of the row of 0's to 9 but changes the other 0's
to 10's. He will give answers like: (1003 - 958 = 1055)

DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/LEFT

When borrowing across 0's the student changes the leftmost 0 to a 9, changes the next 0 to 8, etc. (8002 - 1714 = 6278)

DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/RIGHT

When borrowing across 0's the student changes the rightmost 0 to a 9, changes the next 0 to 8, etc. (8002 - 1714 =
6188)

DECREMENT/ON/FIRST/BORROW

The first column that requires a borrow is decremented before the column subtract is done.
(832 - 265 = 566)

DECREMENT/ONE/TO/ELEVEN

Instead of decrementing a 1, the student changes the 1 to an 11. (314 - 6 = 2118)

DECREMENT/TOP/LEQ/IS/EIGHT

When borrowing from 0 or 1, changes the 0 or 1 to 8; does not decrement digit to the left of the 0 or 1. (4013 - 995 =
3778)

DIFF/0-N=0

When the student encounters a column of the form 0 - N he doesn't borrow; instead he writes 0 as the column answer.
(40 - 21 = 20)
DIFF/0-N=N

When the student encounters a column of the form 0 - N, he doesn't borrow. Instead he writes N as the answer. (80 -
27 = 67)
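
A minimal sketch of DIFF/0-N=N in Python, offered as one reading of the description rather than as the model's actual code. The name `diff_0_minus_n_equals_n` is hypothetical. The sketch assumes the top number is at least as large as the bottom number, that every other column is handled with ordinary borrowing, and that the bug is checked against the digit as it stands when the column is reached.

```python
def diff_0_minus_n_equals_n(top: int, bottom: int) -> int:
    """Simulate DIFF/0-N=N: in a 0 - N column, write N instead of borrowing."""
    t = [int(d) for d in str(top)][::-1]          # top digits, units first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))                  # blanks in the bottom count as 0
    answer = []
    for i in range(len(t)):
        if t[i] == 0 and b[i] > 0:
            answer.append(b[i])                   # the bug: 0 - N is answered with N
            continue
        if t[i] < b[i]:                           # ordinary, correct borrowing
            j = i + 1
            while t[j] == 0:                      # borrow across zeros: each becomes 9
                t[j] = 9
                j += 1
            t[j] -= 1
            t[i] += 10
        answer.append(t[i] - b[i])
    return int("".join(str(d) for d in reversed(answer)))


print(diff_0_minus_n_equals_n(80, 27))  # 67, the glossary example
```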

DIFF/0-N=N/WHEN/BORROW/FROM/ZERO

When borrowing across a 0 and the borrow is caused by a 0, the student does not borrow. Instead he writes the bottom
number as the column answer. He will borrow correctly in the next column or in other circumstances.
1 0 0 4 0 0 

3 2 - 2 4 8 

— n 1 £ 8 



DIFF/1-N=1

When a column has the form 1 - N the student writes 1 as the column answer. (51 - 27 = 31)

DIFF/N-0=0

The student thinks that N - 0 is 0. (57 - 20 = 30)

DIFF/N-N=N

Whenever there is a column that has the same number on the top and the bottom, the student writes that number as the
answer. (83 - 13 = 73)

DOESNT/BORROW

The student stops doing the exercise when a borrow is required. (833 - 262 = 1)

DONT/DECREMENT/SECOND/ZERO

When borrowing across a 0 and the borrow is caused by a 0, the student changes the 0 he is borrowing across into a 10
instead of a 9. (700 - 258 = 452)

DONT/DECREMENT/ZERO

When borrowing across a 0, the student changes the 0 to 10 instead of 9. (506 - 318

DONT/DECREMENT/ZERO/OVER/BLANK

The student will not borrow across a zero that is over a blank. (305 - 9 = 306)

DONT/DECREMENT/ZERO/OVER/ZERO

The student will not borrow across a zero that is over a zero. (305 - 107 = 308)

DONT/DECREMENT/ZERO/UNTIL/BOTTOM/BLANK

When borrowing across a 0, the student changes the 0 to a 10 instead of a 9 unless the 0 is over a blank, in which case he
does the correct thing.

'318 : 9 jMl • 

M 5 8 2 5 5 .0* 

Wrong Cormt 

DONT/WRITE/ZERO

Doesn't write zeros in the answer. (24 - 14 = 1)

DOUBLE/DECREMENT/ONE

When borrowing from a 1, the student treats the 1 as a 0 (changes the 1 to 9 and continues borrowing to the left).
(813 - 515 = 288)

FORGET/BORROW/OVER/BLANKS

The student doesn't decrement a number that is over a blank. (347 - 9 = 348)

IGNORE/LEFTMOST/ONE/OVER/BLANK

When the left column of the exercise has a 1 that is over a blank, the student

IGNORE/ZERO/OVER/BLANK

Whenever there is a column that has a 0 over a blank, the student ignores that column. (907 - 5 = 92)

INCREMENT/OVER/LARGER

When borrowing from a column in which the top is smaller than the bottom, the student increments instead of
decrementing. (833 - 277 = 576)

INCREMENT/ZERO/OVER/BLANK

When borrowing across a 0 over a blank, the student increments the 0 instead of decrementing.
(402 - 6 = 416)

N-9=N-1/AFTER/BORROW

If a column is of the form N - 9 and has been borrowed from, when the student does that column he subtracts 1 instead
of subtracting 9. (834 - 797 = 127)

N-N/AFTER/BORROW/CAUSES/BORROW

Borrows with columns of the form N-N if the column has been borrowed from. (953 - 147 = 7106)

N-N/CAUSES/BORROW

Borrows with columns of the form N-N. (953 - 152 = 7101)

N-N=1/AFTER/BORROW

If a column had the form N - N and was borrowed from, the student writes 1 as the answer to that column. (944 - 348
= 616)

N-N=9/PLUS/DECREMENT

When a column has the same number on the top and the bottom the student writes 9 as the answer and decrements the
next column to the left even though borrowing is not necessary.
(94 - 34 = 59)

ONCE/BORROW/ALWAYS/BORROW

Once a student has borrowed he continues to borrow in every remaining column in the exercise. (488 - 229 = 1159)

QUIT/WHEN/BOTTOM/BLANK

When the bottom number has fewer digits than the top number, the student quits as soon as the bottom number runs
out. (439 - 4 = 5)

SIMPLE/PROBLEM/STUTTER/SUBTRACT

When the bottom number is a single digit and the top number has two or more digits, the student repeatedly subtracts
the single bottom digit from each digit in the top number.
(348 - 2 = 116)

SMALLER/FROM/LARGER

The student doesn't borrow; in each column he subtracts the smaller digit from the larger one.
(81 - 38 = 57)
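
SMALLER/FROM/LARGER is simple enough to state in a few lines of Python. The sketch below is an illustration of the description, not the representation used in the bug database. The function name `smaller_from_larger` is hypothetical, and a blank in the bottom number is assumed to behave like 0.

```python
def smaller_from_larger(top: int, bottom: int) -> int:
    """Simulate SMALLER/FROM/LARGER: never borrow; subtract the smaller digit from the larger."""
    t = [int(d) for d in str(top)][::-1]          # top digits, units first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))                  # blanks in the bottom count as 0
    answer = [abs(td - bd) for td, bd in zip(t, b)]
    return int("".join(str(d) for d in reversed(answer)))


print(smaller_from_larger(81, 38))  # 57, the glossary example
```

On problems that need no borrowing the function returns the correct difference, which is why a diagnostic test has to include borrowing problems to expose the bug.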

SMALLER/FROM/LARGER/INSTEAD/OF/BORROW/FROM/ZERO

The student does not borrow across 0. Instead he will subtract the smaller from the larger digit.

3 0 6 jp^. 5 0 6 ^ 

3 6 2 . 16 2 , , 

SMALLER/FROM/LARGER/WHEN/BORROWED/FROM

When there are two borrows in a row the student does the first one correctly but for the second one he does not borrow;
instead he subtracts the smaller from the larger digit regardless of order. (824 - 157 = 747)

SMALLER/FROM/LARGER/WITH/BORROW

When borrowing the student decrements correctly, then subtracts the smaller digit from the larger as if he had not
borrowed at all. (73 - 24 = 41)

STOPS/BORROW/AT/MULTIPLE/ZERO

Instead of borrowing across several 0's, the student adds 10 to the column he's doing but doesn't change any column to
the left. (4004 - 9 = 4005)

STOPS/BORROW/AT/SECOND/ZERO

When borrowing across several 0's, changes the right 0 to 9 but not the other 0's. (4004 - 9 = 4095)

STOPS/BORROW/AT/ZERO

Instead of borrowing across a 0, the student adds 10 to the column he's doing but doesn't decrement from a column to
the left. (404 - 187 = 227)
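
The STOPS/BORROW/AT/ZERO example can be checked with a short simulation. This is a sketch of one reading of the description, with a hypothetical function name, and it assumes that borrowing from a non-zero digit is done correctly.

```python
def stops_borrow_at_zero(top: int, bottom: int) -> int:
    """Simulate STOPS/BORROW/AT/ZERO: add 10, but do not decrement when the neighbour is 0."""
    t = [int(d) for d in str(top)][::-1]          # top digits, units first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))                  # blanks in the bottom count as 0
    answer = []
    for i in range(len(t)):
        if t[i] < b[i]:                           # a borrow is needed
            t[i] += 10
            if i + 1 < len(t) and t[i + 1] != 0:
                t[i + 1] -= 1                     # normal decrement of a non-zero digit
            # otherwise the bug applies: the 0 is left alone and nothing is decremented
        answer.append(t[i] - b[i])
    return int("".join(str(d) for d in reversed(answer)))


print(stops_borrow_at_zero(404, 187))  # 227, the glossary example
```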

STUTTER/SUBTRACT

When there are blanks in the bottom number, the student subtracts the leftmost digit of the bottom number in every
column that has a blank. (4369 - 22 = 2147)

SUB/BOTTOM/FROM/TOP

The student always subtracts the top digit from the bottom digit. If the bottom digit is smaller, he decrements the top
digit and adds 10 to the bottom before subtracting. If the bottom digit is zero, however, he writes the top digit in the
answer. If the top digit is 1 greater than the bottom he writes 9. He will give answers like this: (4723 - 3065 = 9742)

SUB/COPY/LEAST/BOTTOM/MOST/TOP

The student does not subtract. Instead he copies digits from the exercise to fill in the answer space. He copies the
leftmost digit from the top number and the other digits from the bottom number. He will give answers like this: (648 -
231 = 631)

SUB/ONE/OVER/BLANKS

When there are blanks in the bottom number, the student subtracts 1 from the top digit.
(548 - 2 = 436)

TREAT/TOP/ZERO/AS/NINE

In a 0 - N column, the student doesn't borrow; instead he treats the 0 as if it were a 9. (30 - 4 = 39)

TREAT/TOP/ZERO/AS/TEN

In a 0 - N column, the student adds 10 to it correctly but doesn't change any column to the left. (40 - 27 = 23)

X-N=0/AFTER/BORROW

If a column has been borrowed from, the student writes zero as its answer. (234 -

X-N=N/AFTER/BORROW

If a column has been borrowed from, students w, to the bottom 

ZERO/AFTER/BORROW

When a column requires a borrow, the student decrements correctly but writes 0 as the answer.
(65 - 48 = 10)

ZERO/INSTEAD/OF/BORROW/FROM/ZERO

The student won't borrow if he has to borrow across 0. Instead he will write 0 as the answer to the column requiring the
borrow.

  7 0 2      7 0 2
-     8    - 3 4 8
  7 0 0      3 6 0

ZERO/INSTEADOF/BORROW

The student doesn't borrow; he writes 0 as the answer instead. (42 - 16 = 30)
Appendix 2
All Diagnoses



The diagnoses of all tests of all students analyzed by DEBUGGY fall into the following categories:

Correct         112    (10%)
Slips           223    (20%)
Buggy           417    (37%)
Undiagnosed     386    (34%)

total          1138   (100%)

The diagnoses of the students that were analyzed as having bugs are shown ordered by their
frequency of occurrence. Diagnoses consisting of more than one bug are shown in parentheses.
There are 157 distinct diagnoses, of which only 50 occurred more than once. However, these 50
diagnoses account for 310 of the 417 cases (74%).
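
The percentages in the category table and the 74% coverage figure can be re-derived from the counts given above. The short Python check below uses only those counts; the variable names are illustrative.

```python
# Counts taken from the category table above.
counts = {"Correct": 112, "Slips": 223, "Buggy": 417, "Undiagnosed": 386}

total = sum(counts.values())
percents = {name: round(100 * n / total) for name, n in counts.items()}
print(total, percents)

# 50 diagnoses occurred more than once and cover 310 of the buggy cases;
# the remaining 157 - 50 diagnoses occurred exactly once each.
repeated_cases = 310
singleton_diagnoses = 157 - 50
assert repeated_cases + singleton_diagnoses == counts["Buggy"]

coverage = round(100 * repeated_cases / counts["Buggy"])
print(coverage)  # 74
```

Because each percentage is rounded separately, the four rounded values add up to 101 rather than 100.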



The names used for bugs in this appendix and all others are somewhat different in form than
those used in the text. Smaller-From-Larger is written here as *SMALLER/FROM/LARGER. Also, the
diagnoses in the appendices sometimes contain coercions. A coercion is a modifier that is
included in a diagnosis to improve the fit of the bugs to the student's errors. Most often, these
slightly perturb the definitions of bugs. For example, certain bugs modify the procedure so that
on occasion it will write column answers that are greater than 9. Some students who have these
bugs apparently know from addition that there should only be one answer digit per column, so
they only write the units digit. To capture this, the coercion !WRITE/UNITS/DIGIT/ONLY is added to
the diagnoses of such students by DEBUGGY. Coercions can easily be picked out in the appendices
because they have exclamation points in their names. For more on coercions, see (Burton, 1981).
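
The effect of the units-digit coercion described above can be sketched as a post-processing step on a list of column answers. This is an illustration only: the function name `write_units_digit_only` and the list representation are assumptions, and how DEBUGGY applies coercions internally is not described here.

```python
def write_units_digit_only(column_answers: list[int]) -> list[int]:
    """Keep only the units digit of each column answer (leftmost column first)."""
    return [answer % 10 for answer in column_answers]


# A bug that borrows in every column gives the column answers 1, 15, 9
# for 488 - 229. A student who also writes one digit per column would write 159.
uncoerced = [1, 15, 9]
coerced = write_units_digit_only(uncoerced)
print("".join(str(d) for d in coerced))  # 159
```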

106 occurrences:

*SMALLER/FROM/LARGER

34 occurrences:

*STOPS/BORROW/AT/ZERO

18 occurrences:

*BORROW/ACROSS/ZERO

12 occurrences:

*BORROW/NO/DECREMENT

11 occurrences:

*BORROW/FROM/ZERO

9 occurrences:

(*STOPS/BORROW/AT/ZERO *DIFF/0-N=N)
(*BORROW/ACROSS/ZERO *DIFF/0-N=N)

7 occurrences:

(*BORROW/ACROSS/ZERO/OVER/ZERO *BORROW/ACROSS/ZERO/OVER/BLANK)
*BORROW/DON'T/DECREMENT/UNLESS/BOTTOM/SMALLER

6 occurrences:

(*BORROW/NO/DECREMENT *DIFF/0-N=N)
*BORROW/NO/DECREMENT/EXCEPT/LAST
*ALWAYS/BORROW/LEFT

5 occurrences:

(*SMALLER/FROM/LARGER *IGNORE/LEFTMOST/ONE/OVER/BLANK)
*BORROW/DIFF/0-N=N&SMALL-LARGE=0
(*QUIT/WHEN/BOTTOM/BLANK *SMALLER/FROM/LARGER)

4 occurrences:

(*BORROW/ACROSS/ZERO *DIFF/0-N=0)
*DON'T/DECREMENT/ZERO
(*STOPS/BORROW/AT/ZERO *BORROW/ONCE/THEN/SMALLER/FROM/LARGER *DIFF/0-N=N)
*STOPS/BORROW/AT/MULTIPLE/ZERO
(*BORROW/INTO/ONE=TEN *STOPS/BORROW/AT/ZERO)

3 occurrences:

*DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/RIGHT
*0-N=N/EXCEPT/AFTER/BORROW
(*STOPS/BORROW/AT/ZERO *BORROW/ONCE/THEN/SMALLER/FROM/LARGER)

2 occurrences:

(*DIFF/N-0=0 *SMALLER/FROM/LARGER *DIFF/0-N=0)
(*DIFF/N-0=0 *SMALLER/FROM/LARGER
*SIMPLE/PROBLEM/STUTTER/SUBTRACT
*DECREMENT/ALL/ON/MULTIPLE/ZERO
(*STOPS/BORROW/AT/ZERO *DIFF/0-N=0)
*DECREMENT/LEFTMOST/ZERO/ONLY
(*BORROW/ACROSS/ZERO *0-N=N/EXCEPT/AFTER/BORROW)
*DIFF/0-N=N
*QUIT/WHEN/BOTTOM/BLANK
*STUTTER/SUBTRACT
(*BORROW/ONLY/FROM/TOP/SMALLER *BORROW/ACROSS/ZERO/OVER/ZERO *BORROW/ACROSS/ZERO/OVER/BLANK)
*DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/LEFT
(*SMALLER/FROM/LARGER/INSTEAD/OF/BORROW/FROM/ZERO *BORROW/ONCE/THEN/SMALLER/FROM/LARGER *DIFF/0-N=N)
*0-N=0/AFTER/BORROW
(*STUTTER/SUBTRACT *BORROW/ACROSS/ZERO/OVER/ZERO)
(*BORROW/ACROSS/ZERO *BORROW/DON'T/DECREMENT/TOP/SMALLER)
*BORROW/ACROSS/TOP/SMALLER/DECREMENTING/TO
*BORROW/DON'T/DECREMENT/TOP/SMALLER

1 occurr«ncg>! \ 
^55T5P57§?M5W/AT/Z|R0 •l-l-l/^FTER/BORROW) 

•BORROW/ACROSS/ ZERO/OVER/ZERO :^0-N«N/EXCEPT/ AFTER/BORROW 
•1-1-0/AFTER/BORROW) . 

•O-N-N/EXCEPT/AFTER/BORROW •BORROW/SKIP/EQUAL) \ 

IGNORE/LEFTMOST /OME/OVER/BLANIC V I 

•STOPS/BORROW/AT/ZERO •SMALLER/FRON/LARGER/WHEII/BORROWED/FROM) 

•BORROW/NO/DECREMEUT •OIFF/0-II*0) ^ 

•BORROW/ACROSS/ZERO •DIFf /0-II«M/WHEM/BORROW/FROM72ERO •BORROW/ACROSS/ZERO/OVER/ZERO) 
•SMALLER/FROM/LARGER/INSTEAO/OF/BORROW/FROM/ZERO •DIFF/0-ll«ll 

•SMALLER/FROM/LARGER/WHEN/BORROWED/FRON) 
•O-N-N/EXCEPT/AFTER/BORROW •1-1«1/AFTER/B0RR0W) 

•BORROW/ONCE/WITHOUT/RECURSE •DON'T/DECREMENT/ZERO/UtlTIb/BOTTOM/BLANlt) 
•BORROW/NO/DECREMENT •0-M»ll/EXCEPT/ AFTER/BORROW) 
•BORROW/ ACROSS/ ZERO •QUIT/WHEN/BOTTOM/BLANK •DIFF/O-N'O) 
•BORROW/AtROSS/ZERO •FORGET/BORROWTOV^R/BLANKS *DIFF/O^N*H) 

•DON'T/WRITE/ZERO •BORROW/IGNORE/ZERO/CVER/BLANK) ; ' 

•BORROW/ ACROSS/ ZERO •0-II*0/EXCEPT/AFTER/BORROW •1-1*1/AFTER/B0RR0W) 
•BORROW/ACROSS/ZERO *0-II*0/EXCEPT/AFTER/BORROW •1-1-0/AFTER/BORROW) 
•STOPS/BORROW/AT/ZERO •DIFF/O-H-II •POU'T/WRITE/ZERO) 
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(•BORROW/ ACROSS/ZERO •SIMPLC/PROBLEM/STUTTER/SUBTRACT) 
•AOO/INSTUDOF/SUB 

,(*0-N6N/ArTFR/dOnROW •N-N»1/AFTER/B0RR0W 

•SMALLER/FROM/LARGCR/lNSTEAO/Or/nORROW/rROM/ZERO) . 
(•BORROW/ I NTO/ONE«rEN •DCCREMENl/MULT IPLE/ZEROS/BY/NUMBER/TO/LEFT) 
(•BORROW/ACROSS/ZERO •BORROW/SKIP/EQUAL) 
•FORGFT/BOKROW/OVER/BLANKS 

(•0-N=N/tXCEPT/ArTER/BORROW •1-1=0/AFTER/B0RR0W) 
•1-1=0/AFTER/BORROW 

(•BORROW/ACROSS/ZERO •BORROW/ONCE/THEN/SMALLER/FROH/LARGER ) 
(•SMALLEn/FROM/LARGER •OIFF/N-N=N •OIFF/0-N«0) 

(•BORROW/FROM/ONE/IS/TEN •OECREMENT/LEFTMOST/ZERO/ONLY •BORROWEO/FROM/OON* T/BORROW) 
• (•ST0PS/B0RR0W/AT/ZER6 •0-N=0/EXCEPT/AFTER/BORROW 
'.V •1-1«0/AFTER/BORROW) 

•ADO/LR/OECREMENT/ANSWER/CARRY/TO/RIGHT 

(•STOPS/BORROW/AT/ZERO •0-N«0/AFTER/BORROW) 

(•TREAT/TOP/ZERO/AS/TEN •0-N-N/EXCEPT/AFTER/BORROW /BORROW/ ACROSS/ZERO/OVER/BLANK) 
. •IGNORE/ZERO/OVER/BLANK 

' " •DECREMENT/TOP/LEQ/IS/EIGHT . 

(•BORROW/OON'T/OEtREMENT/UJILESS/BOTTOM/SMALLER •OON'T/WRITE/ZERO) 

( •BORROW/ ACROSS/'TOP/SMALLETt/OECREMENTIMu/TO •BORROW/ONCE/THEN/SMALLER/FROH/LARGER) 

•OON'T/OECREMENT/ZERO/OVER/BLANK 

(•OIFF/0-N'N •OIFF/N-O-O) ^, > ' 

( lONLY/WRlTE/UNITS/OIGIT •JORROW/ACROSS/ZERO •BORROW/OON' T/DECREMENT/TOP/SMALLER 

•N-N/AFTEB/BORROW/CAOSES/BORROW) 
(•STOPS/BORROW/ AT/ZERO •BORROW/OON 'T/OECREMENT/TOP/SMALLER 

•OIFF/0-N-N) 

(•OECREMENT/ALL/ON/MULTIPLE/ZERO •OOUBLE/OECREMENT/ONE) 
(•BORROW/UNIT/OIFF •SMALLER/FROM/LARGER/INSTEAO/OF/BORROW/FROH/ZERO 

•0-NaN/EXCEPT/AFTER/BORROW) ; , ' 

•BORROW/ONCE/THEN/SMALLER/FROM/LARGER . 

(•BORBOW/FROM/ZERO •0-N = 0/AFTER/BORjROW) / 
•SUB/COPY/LEAST/BOTTOM/MOST/TOP , » 

(•N-N/AFTER/BORROW/CAUS^S/BORROW •O-NiO/EXCEPT/AFTEfl'/BORROtf ) 
•N-9-N-VAFTER/BORROW » 
- ( ^SUB/UNITS/SPECIAL •BORROW/ACROSS/ZERO •SMALLER^OM/LARGER) ^ 
(•OECREMENt/LEFTMOST/ZERO/ONLY •BORROWEO/FROM/OOlPr/BORROW) 
(•BORROW/FROM/ZERO •0-N»|J/AFTER/BORROW) 
(•OlFF/N-0-0 •STOPS/BORROW/AT/ZERO) 

(•0]/rF/0-N«N/WHEN/BORROW/FROM/ZERO •STOPS/BORROW/AT/ZERO » 

•BORROW/OON'T/DECREMENT/TOP/SMALLER) * . 

(•BORROW/FROM/BOTTOM/INSTEABOF/ZERO' •OIFF/0-N»N) 
(•BORROW/FROM/ONE/IS/TEM. •BORROl^FROM/ZERO/IS/TEN) 
(•SMALLER/FROM/LARGER •()-N»0/EXCEPT/AFTER/BORROW) 
; (•BORROW/TREAt/ONE/AS/ZERO •N-N«l/AFTE4i/B0RR0W •OON'T/DECREMENT^/ZEROyOVER/BlANK) 
•SUp/BOTTOM/FROM/TOP v . . 

(*borrow/no/6ecrement •oecrement/top/leq/is/eight •x-n»n/after/borrow 
/ . •borrow/once/then/smaller/from/larger) 
^ •oouble/oecrement/one 

- (•B0RR0W/ACR05S/,ZER0 •DlFF/0«i^»lf/WHEN/aORROV^/-FROM/ZERO 
•SMALLER/FROM/LARGEB/WHEN/BORROWEO/FROM) 
(•BORROW/ ACROSS/ZERO •X-N«0/AFTER/BORROW) 
( fONLY/WRlTt/UNITS/OIGIT *STOPS/BORROW/AT/MULTIPLE/ZErtO , 

' •N-N/AFTER/BOi^ROW/CAUSES/BORROW) . 
(•BORROW/FROM/ONE/IS/NINE •BORROW/FROM/ZERO *00N' T/OECREMENT/ZERO/OVER/BLANK) 
- -/•BORROW/FROM/ZERO&LEFT/TEN/OK •0-N«N/AFTER/B0RR0W) 

-fBORHOW/ACROSS/ZERO>OVER/BLANK ' . 

/borrow/ across/zero/oveb/zero V 

/•BORROW/FROM/ALL/ZERO ^ ' ' ' \ ' • ' ' " 

|( !ONLY/WRITE/uNITS/OIGIT •N-N/AFTEi/BORROW/CAUSElS/BORRO)*) 
•N-N/CAUSES/BORROlSf 

(•BORrtOW/DON'T/OECREMENT/UNLESS/BOTTOM/SMALLEp •X-N»0/AFTER/BORROW) 
(lONLY/WRITE/UNITS/OIGIT ^SIMPLE/PROBLEM/STUT.TER/SUBTRACT 

•NrN/AFTER/BORROW/CAUS^S/BORROW) 
(•B0RR0W/N0/6eCREMENT •SUB/ONE/OVER/.BLANK) 

(•OON'T/OECREMENT/ZERO/UNTIL/BOTTOM/BLANK •BORROW/ACROSS/ZERO/OVER/ZERO) 

, . . . ■ ^- ' •»» . ' 
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(•D0N'T/DECREMEIIT7ZERa •OECREMENT/ONE/TO/ELEVEN) 

(•FOnGET/BORROW/OVEH/BLANKS •BORROW/DON'T/DECREMENT/TOP/SMALLER •BORROW/SKIP/EQUAL) 
(•DIFF/O-N-II/WHEM/BORROW/FROM/ZERO •OOM'T/DECREMEMT/ZERO) 
(•SMALLER/FROM/LARGER •O-M-N/EXCEPT/AFTER/BORROW •M-M/CAUSES/BORRW) 
•BORROW/ONLY/FROM/TOP/SMALLER \ * 

(•BORROW/ONLY/FROM/TOP /SMALLER •0-N»N/AFTER/BORRW) V 
(•BORROW/ACROSS/ZERO "BORROW/ONLY/ FROM/TOP/SMALLER). 

(•0-M-M/AFTER/BORROW •BORROW/ ACROSS/ZERO/OVER/ZERO •BORROW/ACRQSS/ZERO/OVER/BLANK) 
(•FORGET/BORROtf/OVER/BLANKS •STOPS/BORROW/ AT/ZERO •BORROW/ONCE/THEM/SHALLER/FROM/LARGER) 
(•BORROW/INTO/dNE-TEN •DECREMENT/MULTIPLE/ZEROS/BY/MUMBER/TO/RIGHT 

•BORROW/ACROSS/ZERO/OVER/ZERO) 
•ZERO/INSTEAOOF/BORROW 

•BORROW/FRO>«/ZERO/IS/TEM / * 

•BORROW/DECREMENTING/TO/BY/EXTRAS \ 
•AlDD/MOCARR'V/INSTEAOOF/SUB 

•0-N-N/AFTER/BORROW ^ . 

(•BORROW/ACROSS/ZERO/OVER/Z€RO •SMALLER/FROM/LARGER/WUtN/BORROWED/FROH ♦ . ^ 

•BORROW/ACROSS/ZEfiP/OVER/BLANIC) 
(•SMALLER/FROM/LARGER •DIFF/0*N»0) 
(•BORROW/ ACROSS/ZERO 'DIFF/N-^NvN) 
(•FORGET/BORROW/OVER/BLAMKS •STOPS/BORROW/ AT/ZERO 

•0-N-N/EXCEPT/AFTER/BORROW •1-1-1/AFTER/BORROW) 



(•BORROW/FROM/OME/IS/NINE 'BORROW/ FROM/ZERO •DIFF/O-M-M) 
•CAN'T/SUBTRACT ' 
( •StMPL^/PROBLEM/STUTTER/SUBTRACT^ •BORRO«/Af ROSS/ZERO/OVER/ZERO) 

nONLY/WRITE/UMITS/DIG •BORROW/OMLY/FROM/TOP/SMALLER •OECREMEMT/ALL/Oir/MULTIPLE/ZEftO 



•N-N/AFTER/B0RR0W/CAUSES/80RR0W) 
(•0-M»N/EXCEPT/AFTE«/BORROW •N-N»«/PLUS/DECREHEMT) 

(•BORROW/ACROSS/ZERO •IGMORE/LEFTMOST/OME/OVER/BLAMK •BORROW/SKIP/EWAL) 
(•FORGET/BORROW/OVER/BLANKS •STOPS/BORROW/AT/ZERO) 4 
(•BUMK/INSTEAOOF/BORROW •DIFF/O-M-N) 
(IWRITE/LEFT/TEir 'SMALLER/FROM/LARGER) 

(•DbUBLE/DECREHEMT/OIIE •SMALLER/FROM/LARGER/WHEM/BORROWCD/FROM) 
•1-1-1/AFTER/BORROW , 

(•BORROW/Na/DECREMEHT/EXCEPT/LAST •O-M-H/EXCEPT/MTER/BORROW) . 
( •BORROW/ ACROSS/ZERO •SUB/ONE/OVER/BLANK ^0-N»M/ AFTER/BORROW ' 

•0-N*N/EXCePT/AFTER/BORRO*) . 
{•00N'T/DECREMEIIT/2ER0 •1-1»0/AFTER/B0RR0W) 



Appendix 3
Bug Occurrence Frequencies



The number of times each bug in the database occurred is shown. The first two columns show
the number of times the given bug occurred in DEBUGGY's diagnoses. The first column, labelled
"alone", is the number of times the bug occurred alone, as the only element of the diagnosis. The
second column, labelled "cmd.", is the number of times the bug occurred as part of a multi-bug
diagnosis, or "compound" bug as it was called in (Brown & Burton, 1978). Thus, for example, the
bug *1-1=1/AFTER/BORROW occurred once alone, and five times as part of a larger diagnosis. The
third column indicates which bugs were added to the database since the Southbay study began. It
is interesting that some of these new bugs are not at all rare. These data include all tests of all
subjects in both the Southbay and short-term study. As always, the data come from the reanalysis
that was performed after all new bugs had been entered in the database. Rows that would be all
zero have been left blank to highlight those bugs in the database which never occurred in these
studies. Of the 104 bugs in the database, 77 bugs occurred at least once.
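
The "alone" and "cmd." columns can be derived from a list of diagnoses and their occurrence counts. The following Python sketch shows one way to do that tally. It is an assumption about the bookkeeping, not the procedure actually used for this table: the function name `tally` is hypothetical, the sample is only a four-line excerpt of the Appendix 2 list, and diagnoses that contain coercions are not handled.

```python
from collections import Counter


def tally(diagnoses):
    """Count how often each bug occurs alone and as part of a multi-bug diagnosis.

    `diagnoses` is a list of (bug_names, occurrences) pairs, where bug_names
    is a tuple of one or more bug names.
    """
    alone, compound = Counter(), Counter()
    for bugs, occurrences in diagnoses:
        bucket = alone if len(bugs) == 1 else compound
        for bug in bugs:
            bucket[bug] += occurrences
    return alone, compound


# A small excerpt of the diagnosis list, not the full data set.
sample = [
    (("SMALLER/FROM/LARGER",), 106),
    (("STOPS/BORROW/AT/ZERO",), 34),
    (("STOPS/BORROW/AT/ZERO", "DIFF/0-N=N"), 9),
    (("BORROW/ACROSS/ZERO", "DIFF/0-N=N"), 9),
]
alone, compound = tally(sample)
print(alone["STOPS/BORROW/AT/ZERO"], compound["STOPS/BORROW/AT/ZERO"])  # 34 9
```

Run over the full list, a tally of this kind would be expected to reproduce the two count columns, subject to how diagnoses with coercions were counted.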

alone  cmpd.  new?   Bug

    2      2         •0-N=0/AFTER/BORROW
    0      6   new   •0-N=0/EXCEPT/AFTER/BORROW
    0 1 6      new   •0-N=N/AFTER/BORROW
    3     14   new   •0-N=N/EXCEPT/AFTER/BORROW
    1      6         •1-1=0/AFTER/BORROW
    1      4         •1-1=1/AFTER/BORROW
                     •ADD/BORROW/CARRY/SUB
                     •ADD/BORROW/DECREMENT
                     •ADD/BORROW/DECREMENT/WITHOUT/CARRY
   10                •ADD/INSTEADOF/SUB
    1      0   new   •ADD/LR/DECREMENT/ANSWER/CARRY/TO/RIGHT
    1      0         •ADD/NOCARRY/INSTEADOF/SUB
                     •ALWAYS/BORROW
    6      0         •ALWAYS/BORROW/LEFT
    0      1   new   •BLANK/INSTEADOF/BORROW
    2      1   new   •BORROW/ACROSS/TOP/SMALLER/DECREMENTING/TO
   18     33         •BORROW/ACROSS/ZERO
    1     12         •BORROW/ACROSS/ZERO/OVER/BLANK
    1     18   new   •BORROW/ACROSS/ZERO/OVER/ZERO
                     •BORROW/ADD/DECREMENT/INSTEADOF/ZERO
                     •BORROW/ADD/IS/TEN
    1      0   new   •BORROW/DECREMENTING/TO/BY/EXTRAS
    6      0         •BORROW/DIFF/0-N=N&SMALL-LARGE=0
    2      6         •BORROW/DON'T/DECREMENT/TOP/SMALLER
    7      2         •BORROW/DON'T/DECREMENT/UNLESS/BOTTOM/SMALLER
   10                •BORROW/FROM/ALL/ZERO
                     •BORROW/FROM/BOTTOM
    0      1         •BORROW/FROM/BOTTOM/INSTEADOF/ZERO
                     •BORROW/FROM/LARGER
    0      2         •BORROW/FROM/ONE/IS/NINE
    0      2         •BORROW/FROM/ONE/IS/TEN
   11      4         •BORROW/FROM/ZERO
    0      1         •BORROW/FROM/ZERO&LEFT/TEN/OK
    1      1   new   •BORROW/FROM/ZERO/IS/TEN
    0      1         •BORROW/IGNORE/ZERO/OVER/BLANK
    0      6         •BORROW/INTO/ONE=TEN
   12     10         •BORROW/NO/DECREMENT
    6      1   new   •BORROW/NO/DECREMENT/EXCEPT/LAST
    1     13   new   •BORROW/ONCE/THEN/SMALLER/FROM/LARGER
    0      1         •BORROW/ONCE/WITHOUT/RECURSE
    1      5         •BORROW/ONLY/FROM/TOP/SMALLER
                     •BORROW/ONLY/ONCE
                     •BORROW/SKIP/EQUAL
                     •BORROW/TEN/PLUS/NEXT/DIGIT/INTO/ZERO
                     •BORROW/TREAT/ONE/AS/ZERO
                     •BORROW/UNIT/DIFF
                     •BORROW/WONT/RECURSE
                     •BORROWED/FROM/DON'T/BORROW
                     •CAN'T/SUBTRACT
                     •COPY/TOP/IN/LAST/COLUMN/IF/BORROWED/FROM
                     •DECREMENT/ALL/ON/MULTIPLE/ZERO
                     •DECREMENT/BY/ONE/PLUS/ZEROS
                     •DECREMENT/BY/TWO/OVER/TWO
                     •DECREMENT/LEFTMOST/ZERO/ONLY
                     •DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/LEFT
                     •DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/RIGHT
                     •DECREMENT/ON/FIRST/BORROW
                     •DECREMENT/ONE/TO/ELEVEN
                     •DECREMENT/TOP/LEQ/IS/EIGHT
                     •DIFF/0-N=0
                     •DIFF/0-N=N
                     •DIFF/0-N=N/WHEN/BORROW/FROM/ZERO
                     •DIFF/1-N=1
                     •DIFF/N-0=0
                     •DIFF/N-N=N
                     •DOESNT/BORROW
                     •DON'T/DECREMENT/SECOND/ZERO
                     •DON'T/DECREMENT/ZERO
                     •DON'T/DECREMENT/ZERO/OVER/BLANK
                     •DON'T/DECREMENT/ZERO/OVER/ZERO
                     •DON'T/DECREMENT/ZERO/UNTIL/BOTTOM/BLANK
                     •DON'T/WRITE/ZERO
                     •DOUBLE/DECREMENT/ONE
                     •FORGET/BORROW/OVER/BLANKS
                     •IGNORE/LEFTMOST/ONE/OVER/BLANK
                     •IGNORE/ZERO/OVER/BLANK
                     •INCREMENT/OVER/LARGER
                     •INCREMENT/ZERO/OVER/BLANK
                     •N-9=N-1/AFTER/BORROW
                     •N-N/AFTER/BORROW/CAUSES/BORROW
                     •N-N/CAUSES/BORROW
                     •N-N=1/AFTER/BORROW
                     •N-N=9/PLUS/DECREMENT
                     •ONCE/BORROW/ALWAYS/BORROW
                     •QUIT/WHEN/BOTTOM/BLANK
                     •SIMPLE/PROBLEM/STUTTER/SUBTRACT
  106     18         •SMALLER/FROM/LARGER
                     •SMALLER/FROM/LARGER/WHEN/BORROWED/FROM
                     •SMALLER/FROM/LARGER/WITH/BORROW
                     •STOPS/BORROW/AT/MULTIPLE/ZERO
                     •STOPS/BORROW/AT/SECOND/ZERO
                     •STOPS/BORROW/AT/ZERO
                     •STUTTER/SUBTRACT
                     •SUB/COPY/LEAST/BOTTOM/MOST/TOP
                     •SUB/ONE/OVER/BLANK
                     •TREAT/TOP/ZERO/AS/TEN
                     •X-N=0/AFTER/BORROW
                     •X-N=N/AFTER/BORROW
                     •ZERO/AFTER/BORROW
                     •ZERO/INSTEADOF/BORROW/FROM/ZERO
                     •ZERO/INSTEADOF/BORROW



Appendix 4
DEBUGGY Versus the Experts



Several different ways of comparing DEBUGGY's diagnosis with the experts' diagnoses are
presented here. First, the results from the Southbay study are presented. They show excellent
agreement regarding which bugs a Buggy student has, but differ a little bit on whether a student
should be placed in the Buggy category or the Undiagnosed category.



                                  Experts' Diagnosis
DEBUGGY's Diagnosis    Slips        Buggy        Undiagnosed    totals

Slips                  148 (95%)      4 ( 3%)      3 ( 2%)      155 (100%)
Buggy                    3 ( 1%)    233 (79%)     59 (20%)      295 (100%)
Undiagnosed             18 ( 8%)     30 (13%)    188 (80%)      236 (100%)

totals                 169          267          250            686

Table 4.1: A comparison of DEBUGGY's diagnosis with the experts' diagnosis by diagnostic
category. The second row, for example, shows that of the 295 subjects that were analyzed as
Buggy by DEBUGGY, 1% were analyzed as having slips by the expert, 79% as Buggy, and 20% as
Undiagnosed. Only the first tests of those subjects who were tested twice are counted in this table.
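The percentages in Table 4.1 are row percentages: each cell divided by the total for DEBUGGY's category. The following is a minimal sketch of that arithmetic, using the cell counts from the table. The function names `row_percentages` and `diagonal_share` are illustrative and do not come from DEBUGGY itself, and the overall same-category share it prints is a derived figure, not one reported in the study.

```python
# Cell counts copied from Table 4.1.
# Rows: DEBUGGY's category; columns: the experts' category.
CATEGORIES = ["Slips", "Buggy", "Undiagnosed"]
TABLE_4_1 = [
    [148, 4, 3],     # DEBUGGY said Slips
    [3, 233, 59],    # DEBUGGY said Buggy
    [18, 30, 188],   # DEBUGGY said Undiagnosed
]


def row_percentages(table):
    """Express every cell as a whole-number percentage of its row total."""
    return [[round(100 * cell / sum(row)) for cell in row] for row in table]


def diagonal_share(table):
    """Fraction of all subjects that both diagnoses put in the same category."""
    total = sum(sum(row) for row in table)
    same = sum(table[i][i] for i in range(len(table)))
    return same / total


percentages = row_percentages(TABLE_4_1)
share = diagonal_share(TABLE_4_1)

for name, row in zip(CATEGORIES, percentages):
    print(f"{name:12s}", row)
print(f"same category: {share:.1%}")
```

Reading the second row of `percentages` reproduces the example given in the caption: 1%, 79% and 20%.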



Equal                    193 (83%)
Expert removed a bug      13 ( 6%)
Expert added a bug         4 ( 2%)
Overlap                   10 ( 4%)
Otherwise                 13 ( 6%)

total                    233 (100%)

Table 4.2: A comparison of DEBUGGY's diagnosis with the expert's diagnosis by bug. Of the 233
subjects that both DEBUGGY and the expert agree were Buggy, 193 were given exactly the same
diagnosis. In all but 13 cases (i.e. 94% of the time) there was substantial agreement between the
experts and DEBUGGY. Only the first tests of those subjects who were tested twice are counted in
this table.



In the following four tables, a three-way comparison of expert and DEBUGGY diagnoses is
presented. Two experts, namely the authors, analyzed the short-term data, as did DEBUGGY. The
first three tables compare their judgments by diagnostic category. The first compares the experts
to each other, and the next two compare DEBUGGY to each expert individually. It can be seen that
the experts agreed more with DEBUGGY than with each other.



Table 4.3
                                VanLehn
Friend         Slips       Bugs        Undiagnosed   totals

Slips          54 (90%)     1 ( 2%)     5 ( 8%)       60 (100%)
Bugs            1 ( 3%)    23 (68%)    10 (29%)       34 (100%)
Undiagnosed     6 (23%)    10 (38%)    10 (38%)       26 (100%)

totals         61          34          25            120



Table 4.4
                                VanLehn
DEBUGGY        Slips       Bugs        Undiagnosed   totals

Slips          40 (98%)     0 ( 0%)     1 ( 2%)       41 (100%)
Bugs            1 ( 3%)    28 (80%)     6 (17%)       35 (100%)
Undiagnosed    20 (45%)     6 (14%)    18 (41%)       44 (100%)

totals         61          34          25            120



Table 4.5
                                Friend
DEBUGGY        Slips       Bugs        Undiagnosed   totals

Slips          41 (98%)     0 ( 0%)     1 ( 2%)       42 (100%)
Bugs            1 ( 3%)    28 (76%)     8 (22%)       37 (100%)
Undiagnosed    21 (42%)     9 (18%)    20 (40%)       50 (100%)

totals         63          37          29            129



The following table compares the experts' diagnoses and DEBUGGY's by comparing the sets of bugs
they produced for the cases where both put the student in the Buggy category.



Table 4.6
                                        V vs. F       V vs. D       F vs. D

Equal                                    9 (39%)      15 (54%)      20 (71%)
One bug-set is a subset of the other     5 (22%)       6 (21%)       2 ( 7%)
Overlap                                  3 (13%)       5 (18%)       2 ( 7%)
Otherwise                                6 (26%)       2 ( 7%)       4 (14%)

total                                   23 (100%)     28 (100%)     28 (100%)
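The four row labels of Table 4.6 can be read as a set comparison between two diagnoses. The sketch below is one way to express that reading. It assumes the categories mean what their names suggest ("Otherwise" is taken to mean that the two bug sets share no bug), and the function name `compare_bug_sets` is hypothetical rather than part of DEBUGGY.

```python
def compare_bug_sets(first, second):
    """Classify two diagnoses (collections of bug names) by how their bug sets relate."""
    a, b = frozenset(first), frozenset(second)
    if a == b:
        return "Equal"
    if a < b or b < a:  # proper subset in either direction
        return "One bug-set is a subset of the other"
    if a & b:           # at least one shared bug, neither set contains the other
        return "Overlap"
    return "Otherwise"


# Bug names taken from the bug list in this report; the pairings are illustrative only.
same = compare_bug_sets(
    ["BORROW/ACROSS/ZERO", "DIFF/0-N=0"],
    ["DIFF/0-N=0", "BORROW/ACROSS/ZERO"],
)
subset = compare_bug_sets(
    ["STOPS/BORROW/AT/ZERO", "DIFF/0-N=N"],
    ["STOPS/BORROW/AT/ZERO"],
)
overlap = compare_bug_sets(
    ["STOPS/BORROW/AT/ZERO", "1-1=1/AFTER/BORROW"],
    ["STOPS/BORROW/AT/ZERO", "1-1=0/AFTER/BORROW"],
)
disjoint = compare_bug_sets(["SMALLER/FROM/LARGER"], ["BORROW/NO/DECREMENT"])
print(same, subset, overlap, disjoint, sep="\n")
```

Using sets makes the comparison independent of the order in which a diagnosis lists its bugs.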






Appendix 5
Short-term Stability

The short-term stability results are presented. The tests were given two days apart, using the same
or very similar tests. The tests were analyzed by DEBUGGY and by an expert. DEBUGGY's analyses
will be presented first.



                                   Second Test
First test     Correct     Slips       Buggy       Undiagnosed   totals

Correct        3 (43%)      3 (43%)     0 ( 0%)     1 (14%)        7 (100%)
Slips          4 (20%)     11 (55%)     0 ( 0%)     5 (25%)       20 (100%)
Buggy          0 ( 0%)      2 (12%)    12 (71%)     3 (18%)       17 (100%)
Undiagnosed    0 ( 0%)      5 (22%)     6 (26%)    12 (52%)       23 (100%)

totals         7           21          18          21             67

Table 5.1: Short-term stability by diagnostic category. The above table shows how the students
changed among diagnostic classes across the two tests. The figures in parentheses show the
proportion of the first test's category that the given cell of the table represents. For example, of
the 17 students who were in the Buggy category on the first test, 0% were in the Correct category
on the second test, 12% were in the Slips category, 71% remained in the Buggy category, and 18%
could not be diagnosed on the second test.



The switching between the Correct and Slips categories was expected since slips are assumed to be
labile, "performance" phenomena. The switching among the Undiagnosed and Slips categories is
probably due to students who should have been placed in Slips instead of Undiagnosed, but they
made so many slips that they exceeded DEBUGGY's 90%-correct threshold for the Slips category.
What's unexplained is the movement into and out of the Buggy category. This movement is
examined more closely in the next table.
Table 5.2: Short-term stability by bug set. This table shows DEBUGGY's diagnoses of the 12
subjects who were Buggy on both tests. Two of these had the same diagnoses both times. The
other ten subjects' diagnoses had bugs appearing and disappearing, indicating instability. Cases of
bug migration are marked with ®.

Diagnoses are equal:

O  (•BORROW/ACROSS/ZERO •DIFF/0-N=0) becomes
      (•BORROW/ACROSS/ZERO •DIFF/0-N=0)
O  (•DIFF/N-0=0 •SMALLER/FROM/LARGER •DIFF/0-N=0) becomes
      (•DIFF/N-0=0 •SMALLER/FROM/LARGER •DIFF/0-N=0)

Diagnoses overlap:

O  (•0-N=N/EXCEPT/AFTER/BORROW) becomes
      (•0-N=N/EXCEPT/AFTER/BORROW •1-1=0/AFTER/BORROW)
O  (•BORROW/ACROSS/ZERO •SIMPLE/PROBLEM/STUTTER/SUBTRACT) becomes
      (•BORROW/ACROSS/ZERO)
O  (•STOPS/BORROW/AT/ZERO •DIFF/0-N=N) becomes
      (•STOPS/BORROW/AT/ZERO •DIFF/0-N=N •SMALLER/FROM/LARGER/WHEN/BORROWED/FROM)
®  (•STOPS/BORROW/AT/ZERO •0-N=0/EXCEPT/AFTER/ZERO •1-1=1/AFTER/BORROW) becomes
      (•STOPS/BORROW/AT/ZERO •0-N=0/EXCEPT/AFTER/ZERO •1-1=0/AFTER/BORROW)
O  (•STOPS/BORROW/AT/ZERO •0-N=0/EXCEPT/AFTER/ZERO •1-1=0/AFTER/BORROW) becomes
      (•STOPS/BORROW/AT/ZERO •DIFF/0-N=0)
O  (•BORROW/ACROSS/ZERO/OVER/ZERO •0-N=N/EXCEPT/AFTER/ZERO •1-1=0/AFTER/BORROW) becomes
      (•0-N=N/EXCEPT/AFTER/BORROW •BORROW/SKIP/EQUAL)

Diagnoses do not overlap:

O  (•DIFF/N-0=0 •SMALLER/FROM/LARGER •DIFF/0-N=0) becomes
      (•BORROW/ACROSS/ZERO)
O  (•SMALLER/FROM/LARGER) becomes
      (•STOPS/BORROW/AT/ZERO •DIFF/0-N=N)
O  (•BORROW/ONCE/WITHOUT/RECURSE •DON'T/DECREMENT/ZERO/UNTIL/BOTTOM/BLANK) becomes
      (•BORROW/NO/DECREMENT)
®  (•BORROW/ACROSS/ZERO •DIFF/0-N=0) becomes
      (•STOPS/BORROW/AT/ZERO)



Table 5.3: Short-term stability by expert's diagnostic categories. The expert was able to uncover
use of tinkering by comparing the answers of the test items across tests. Since each item was
matched to a corresponding item on the other test, it was easier to come to a decision about which
errors were due to slips and which were due to tinkering or bugs. The following table presents
this Repair Theoretic analysis in the same format as table 5.1.

                                        Second Test
First Test     Correct    Slips       Buggy       Tinkering   Undiagnosed   totals

Correct        3 (43%)     4 (57%)     0 ( 0%)    0 ( 0%)     0 ( 0%)        7 (100%)
Slips          4 (13%)    25 (83%)     1 ( 3%)    0 ( 0%)     0 ( 0%)       30 (100%)
Buggy          0 ( 0%)     1 ( 6%)    14 (78%)    2 (11%)     1 ( 6%)       18 (100%)
Tinkering      0 ( 0%)     0 ( 0%)     1 (14%)    5 (71%)     1 (14%)        7 (100%)
Undiagnosed    0 ( 0%)     1 (20%)     0 ( 0%)    0 ( 0%)     4 (80%)        5 (100%)

totals         7          31          16          7           6             67



Most of the switching is between the Correct and Slips categories, as expected by the assumed
instability of slips, and between the Buggy and Tinkering categories, as predicted by bug/tinkering
migration.



Table 5.4: Short-term stability of bugs and impasses. There were 36 cases where the student was
diagnosed as performing the correct procedure on both tests, with perhaps some slips
each time. There were 4 cases where the student was undiagnosable on both tests. The other 27
cases are presented in the table below. They are separated into four
categories: stable bugs, stable impasses, unstable bugs and unstable impasses. Tinkering is notated
with parenthesized lists of the form

(<impasse> <repair> <repair> <repair>)

where <impasse> is the name of the impasse and <repair> is the name of one of the repairs being
used to get past the impasse. The repairs are documented in (Brown & VanLehn, 1980). The
impasses used here are:



T=0/BF             Can't borrow from zero.
T=1/BF             Can't borrow from one.
T=00/BF            Can't borrow from multiple zeros.
T=0AB/BF           Can't borrow from ones that have been decremented to zero.
T=B/BF             Can't borrow from a column where the top and bottom digits are equal.
T=0/SC             Can't process a column with a zero on top.
T=0AB/SC           Can't process a column with a top zero created by decrementing a one.
T=B/SC             Can't process a column where the top and bottom digits are equal.
T=0AB&T=BBB/SC     Can't process a column with a top zero created by decrementing a one,
                   whose top and bottom digits were equal before the one was decremented.
ANS/OVERFLOW       Can't write two digits for a column answer.
DECR/TWICE         Can't borrow from a digit that's been borrowed from already.
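The parenthesized notation can be read mechanically: bare names prefixed with • are bugs, and each nested list holds an impasse name followed by its repairs. The following is a minimal sketch of such a reader, assuming exactly that layout and one level of nesting. The name `parse_diagnosis` and the returned structure are illustrative choices, not part of DEBUGGY or of Repair Theory.

```python
import re

# A token is an opening parenthesis, a closing parenthesis, or a run of other characters.
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def parse_diagnosis(text):
    """Split one parenthesized diagnosis into bug names and (impasse, repairs) pairs."""
    tokens = TOKEN.findall(text)
    if len(tokens) < 2 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("a diagnosis must be a single parenthesized list")
    bugs, impasses = [], []
    end = len(tokens) - 1          # index of the final closing parenthesis
    i = 1
    while i < end:
        token = tokens[i]
        if token == "(":
            close = tokens.index(")", i)
            inner = tokens[i + 1:close]
            if close == end or not inner or "(" in inner:
                raise ValueError("malformed impasse list")
            impasses.append((inner[0], inner[1:]))  # impasse name, then its repairs
            i = close + 1
        elif token == ")":
            raise ValueError("unbalanced parentheses")
        else:
            bugs.append(token.lstrip("•"))          # a bare name is a bug
            i += 1
    return {"bugs": bugs, "impasses": impasses}


example = parse_diagnosis(
    "(•BORROW/ACROSS/ZERO (T=0/SC DEMEMOIZE REFOCUS/VERTICALLY))"
)
empty = parse_diagnosis("()")
print(example)
```

Trailing annotations such as "slips" or "undiagnosed" after an empty list are outside this sketch and would need to be stripped before parsing.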



Stable bugs:

O  (•STOPS/BORROW/AT/MULTIPLE/ZERO) becomes
      (•STOPS/BORROW/AT/MULTIPLE/ZERO)
O  (•DIFF/N-0=0 •SMALLER/FROM/LARGER •DIFF/0-N=0) becomes
      (•DIFF/N-0=0 •SMALLER/FROM/LARGER •DIFF/0-N=0)
O  (•DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/RIGHT) becomes
      (•DECREMENT/MULTIPLE/ZEROS/BY/NUMBER/TO/RIGHT)

Stable impasses with stable bugs (n.b. impasses with just one repair are bugs):

O  (•STOPS/BORROW/AT/ZERO •0-N=0/EXCEPT/AFTER/BORROW (T=0AB/SC IGNORE DEMEMOIZE)) becomes
      (•STOPS/BORROW/AT/ZERO •0-N=0/EXCEPT/AFTER/BORROW •0-N=0/AFTER/BORROW)
O  ((T=00/BF IGNORE FSELF WEIRD)) becomes
      ((T=00/BF IGNORE FSELF WEIRD))
O  ((T+1=B/BI WEIRD IGNORE)) becomes
      ((T+1=B/BI WEIRD IGNORE))
O  (•BORROW/ACROSS/ZERO (T=0/SC DEMEMOIZE REFOCUS/VERTICALLY)) becomes
      (•BORROW/ACROSS/ZERO (T=0/SC DEMEMOIZE REFOCUS/VERTICALLY))
O  (•BORROW/ACROSS/ZERO (T=0/SC IGNORE DEMEMOIZE REFOCUS/VERTICALLY)) becomes
      (•BORROW/ACROSS/ZERO (T=0/SC IGNORE DEMEMOIZE REFOCUS/VERTICALLY))
O  ((T=0AB/SC IGNORE NOOP)) becomes
      ((T=0AB/SC IGNORE NOOP))
O  (•BORROW/FROM/BOTTOM/INSTEADOF/ZERO •0-N=N/EXCEPT/AFTER/BORROW
      (T=0AB&T=BBB/SC IGNORE DEMEMOIZE)) becomes
      (•BORROW/FROM/BOTTOM/INSTEADOF/ZERO •0-N=N/EXCEPT/AFTER/BORROW
      (T=0AB&T=BBB/SC IGNORE NOOP DEMEMOIZE))
O  (•0-N=N/EXCEPT/AFTER/BORROW (T=0AB&T=BBB/SC QUIT/THE/TEST)) becomes
      (•0-N=N/EXCEPT/AFTER/BORROW •1-1=0/AFTER/BORROW)
O  (•DON'T/DECREMENT/ZERO (ANS/OVERFLOW IGNORE NOOP)) becomes
      (•DON'T/DECREMENT/ZERO !ONLY/WRITE/UNITS/DIGIT)
O  ((T=0/BF REFOCUS/LEFT BACKUP-REFOCUS/VERTICALLY)) becomes
      (•BORROW/ACROSS/ZERO)
O  (•SIMPLE/PROBLEM/STUTTER/SUBTRACT) becomes
      ((SIMPLE/MULTIPLICATION/PROBLEM IGNORE WEIRD))
O  (•DIFF/0-N=N (T=0/BF NOOP BACKUP-REFOCUS/VERTICALLY)) becomes
      (•DIFF/0-N=N •STOPS/BORROW/AT/ZERO)
Appearing and disappearing bugs:

O  (•DECREMENT/ALL/ON/MULTIPLE/ZERO/EXCEPT/AFTER/BORROW
      •SIMPLE/PROBLEM/STUTTER/SUBTRACT) becomes
      (•DECREMENT/ALL/ON/MULTIPLE/ZERO/EXCEPT/AFTER/BORROW)
O  (•BORROW/ACROSS/ZERO •SIMPLE/PROBLEM/STUTTER/SUBTRACT (T=0AB/BF NOOP WEIRD)) becomes
      (•BORROW/ACROSS/ZERO (T=0AB/BF NOOP QUIT))
O  () slips becomes
      (•SMALLER/FROM/LARGER)
O  (•BORROW/IGNORE/ZERO/OVER/BLANK) becomes
      () slips
O  (•1-1=0/AFTER/BORROW) becomes
      () undiagnosed

Appearing and disappearing impasses (and bugs):

O  ((T=0/BF NOOP REFOCUS/LEFT) (T=0AB&T=BBB/SC IGNORE DEMEMOIZE)) becomes
      ((T=0/BF BORROW NOOP) (ANS/OVERFLOW NOOP ERASE&PARTIAL/REDO))
O  (•SIMPLE/PROBLEM/STUTTER/SUBTRACT) becomes
      ((SIMPLE/MULTIPLICATION/PROBLEM IGNORE WEIRD))
O  (•DIFF/0-N=N •STOPS/BORROW/AT/ZERO (T=B/BF IGNORE BACKUP-REFOCUS/VERTICALLY)
      (T=BBB/SC REFOCUS/VERTICALLY DEMEMOIZE)) becomes
      (•DIFF/0-N=N •STOPS/BORROW/AT/ZERO •SMALLER/FROM/LARGER/WHEN/BORROWED/FROM)
O  (•DIFF/0-N=N/WHEN/BORROW/FROM/ZERO •BORROW/ACROSS/ZERO
      (T=0AB/SC DEMEMOIZE IGNORE WEIRD)) becomes
      ((T=0/BF BORROW MOVL) (ANS/OVERFLOW FADD) (T=0AB/SC IGNORE QUIT))
O  (•DIFF/N-0=0 •SMALLER/FROM/LARGER •DIFF/0-N=0) becomes
      (•BORROW/ACROSS/ZERO (T=0AB/BF NOOP))
O  (•BORROW/ACROSS/ZERO (DECR/TWICE NOOP IGNORE) (T=0AB&T=BBB/SC IGNORE NOOP)) becomes
      (•STOPS/BORROW/AT/ZERO)
O  ((T=1/BF NOOP ADD10 IGNORE WEIRD) (T=0/BF FSELF NOOP IGNORE)) becomes
      () undiagnosed
Appendix 6
Long-term Stability



Table 6.1: Long-term stability by diagnostic category. This shows the diagnostic classes of both
tests of the 154 subjects who were tested twice during the Southbay study. 84 of the 154 students
(or 55%) stayed in the same diagnostic category.

                                   Second Test
First test     Correct    Slips       Bugs        Undiagnosed   totals

Correct        0           0           0           0              0
Slips          0 ( 0%)     4 (67%)     2 (33%)     0 ( 0%)        6 (100%)
Bugs           1 ( 2%)     5 ( 8%)    34 (53%)    24 (38%)       64 (100%)
Undiagnosed    3 ( 4%)    13 (15%)    22 (26%)    46 (55%)       84 (100%)

totals         4          22          58          70            154



Roughly the same proportion of students switched from Buggy to Undiagnosed as from
Undiagnosed to Buggy. Neither category contributed significantly to the Correct category. These
two facts tend to confound the hypothesis that Buggy students are more often remediated by the
current curriculum than Undiagnosed students.
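The 55% stability figure quoted for Table 6.1 is the number of students whose category was the same on both tests, divided by the number retested. A minimal sketch of that calculation follows. It uses only the Slips, Bugs and Undiagnosed cells of the table plus the total of 154 from the caption; the names `TRANSITIONS` and `stayed_in_category` are illustrative.

```python
# Cells of Table 6.1, keyed by (first-test category, second-test category).
TRANSITIONS = {
    ("Slips", "Slips"): 4,
    ("Slips", "Bugs"): 2,
    ("Slips", "Undiagnosed"): 0,
    ("Bugs", "Slips"): 5,
    ("Bugs", "Bugs"): 34,
    ("Bugs", "Undiagnosed"): 24,
    ("Undiagnosed", "Slips"): 13,
    ("Undiagnosed", "Bugs"): 22,
    ("Undiagnosed", "Undiagnosed"): 46,
}
TOTAL_RETESTED = 154  # stated in the caption of Table 6.1


def stayed_in_category(transitions):
    """Count the students whose first and second diagnostic categories match."""
    return sum(n for (first, second), n in transitions.items() if first == second)


stayed = stayed_in_category(TRANSITIONS)
stability = stayed / TOTAL_RETESTED
print(stayed, f"{stability:.0%}")
```

Dividing by the caption's total rather than by the sum of the listed cells keeps the figure comparable with the one reported in the caption.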
Table 6.2: Long-term stability by bug. The diagnoses for both tests are shown for the 34 students
who were Systematic on both tests. 17 students had roughly the same diagnosis on both tests,
indicating that these bugs can be problems. The diagnoses of the other 17 students did
not overlap. Of these, 13 subjects showed evidence of learning in their bugs in that the subskill
their first bug(s) could not accomplish has been mastered, but a more advanced subskill is missing
leading to a second bug(s).



Second test's diagnosis is the same as or overlaps the first test's diagnosis:

O  (•SMALLER/FROM/LARGER) becomes
      (•SMALLER/FROM/LARGER)
O  (•SMALLER/FROM/LARGER) becomes
      (•SMALLER/FROM/LARGER)
O  (•SMALLER/FROM/LARGER) becomes
      (•SMALLER/FROM/LARGER)
O  (•SMALLER/FROM/LARGER) becomes
      (•SMALLER/FROM/LARGER)
O  (•SMALLER/FROM/LARGER) becomes
      (•SMALLER/FROM/LARGER)
O  (•SMALLER/FROM/LARGER) becomes
      (•SMALLER/FROM/LARGER)
O  (•SMALLER/FROM/LARGER) becomes
      (•SMALLER/FROM/LARGER)

(•isiiiktLER>fB<^ ^ ' • :■ 'V 

O (•QumimER/^ou 

(•SMALLER/FrtbM/LARGER) ' 
O  (•QUIT/WHEN/BOTTOM/BLANK •SMALLER/FROM/LARGER) becomes
      (•QUIT/WHEN/BOTTOM/BLANK •SMALLER/FROM/LARGER)
O  (•BORROW/DIFF/0-N=N&SMALL-LARGE=0) becomes
      (•BORROW/DIFF/0-N=N&SMALL-LARGE=0)
O  (•STOPS/BORROW/AT/ZERO •SMALLER/FROM/LARGER/WHEN/BORROWED/FROM) becomes
      (•STOPS/BORROW/AT/ZERO)
O  (•STOPS/BORROW/AT/ZERO •DIFF/0-N=N) becomes
      (•STOPS/BORROW/AT/ZERO)
O  (•BORROW/ACROSS/ZERO •DIFF/0-N=N) becomes
      (•BORROW/ACROSS/ZERO)
O  (•BORROW/ACROSS/ZERO •QUIT/WHEN/BOTTOM/BLANK •DIFF/0-N=0) becomes
      (•BORROW/ACROSS/ZERO •DIFF/0-N=N)
O  (•STOPS/BORROW/AT/ZERO •BORROW/ONCE/THEN/SMALLER/FROM/LARGER •DIFF/0-N=N) becomes
      (•STOPS/BORROW/AT/ZERO •1-1=1/AFTER/BORROW)
O  (•STOPS/BORROW/AT/MULTIPLE/ZERO) becomes
      (!ONLY/WRITE/UNITS/DIGIT •STOPS/BORROW/AT/MULTIPLE/ZERO
      •N-N/AFTER/BORROW/CAUSES/BORROW)

No overlap between the two tests' diagnoses, evidence of learning:

O  (•ADD/NOCARRY/INSTEADOF/SUB) becomes
      (•SMALLER/FROM/LARGER)
O  (•SMALLER/FROM/LARGER) becomes
      (•DIFF/0-N=N/WHEN/BORROW/FROM/ZERO •DON'T/DECREMENT/ZERO)
O  (•SMALLER/FROM/LARGER) becomes
      (•STOPS/BORROW/AT/ZERO)
O  (•SMALLER/FROM/LARGER) becomes
      (•BORROW/FROM/BOTTOM/INSTEADOF/ZERO •DIFF/0-N=N)
O  (•SMALLER/FROM/LARGER) becomes
      (•BORROW/NO/DECREMENT •DIFF/0-N=N)
O  (•SMALLER/FROM/LARGER) becomes
      (•ALWAYS/BORROW/LEFT)
O  (•SMALLER/FROM/LARGER) becomes
      (•BORROW/ACROSS/ZERO •DIFF/0-N=N/WHEN/BORROW/FROM/ZERO
      •SMALLER/FROM/LARGER/WHEN/BORROWED/FROM)
O  (•SMALLER/FROM/LARGER) becomes
      (•BORROW/NO/DECREMENT)
O  (•SMALLER/FROM/LARGER) becomes
      (•STOPS/BORROW/AT/ZERO •BORROW/ONCE/THEN/SMALLER/FROM/LARGER)
O  (•BORROW/NO/DECREMENT/EXCEPT/LAST) becomes
      (•STOPS/BORROW/AT/ZERO)
O  (•ALWAYS/BORROW/LEFT) becomes
      (•BORROW/INTO/ONE=TEN •STOPS/BORROW/AT/ZERO)
O  (•BORROW/DON'T/DECREMENT/UNLESS/BOTTOM/SMALLER) becomes
      (•BORROW/INTO/ONE=TEN •STOPS/BORROW/AT/ZERO)
O  (•BORROW/FROM/ZERO) becomes
      (•STOPS/BORROW/AT/MULTIPLE/ZERO)

No overlap between diagnoses, no evidence of learning:

O  (•BORROW/ACROSS/ZERO •0-N=N/EXCEPT/AFTER/BORROW) becomes
      (•DON'T/DECREMENT/ZERO)
O  (•FORGET/BORROW/OVER/BLANKS •BORROW/DON'T/DECREMENT/TOP/SMALLER •BORROW/SKIP/EQUAL) becomes
      (•BORROW/NO/DECREMENT)
O  (•BORROW/ACROSS/ZERO •FORGET/BORROW/OVER/BLANKS •DIFF/0-N=N) becomes
      (•DON'T/WRITE/ZERO •BORROW/IGNORE/ZERO/OVER/BLANK)
O  (•BORROW/NO/DECREMENT •SUB/ONE/OVER/BLANK) becomes
      (•BORROW/DON'T/DECREMENT/UNLESS/BOTTOM/SMALLER)



